[HACKERS] Getting the OID inside the executor

2011-09-01 Thread Fazool
Hello everyone,

I am implementing some functionality into Postgresql, where I want to
track which row was accessed by a user query. I am implementing the
functionality inside Postgres, so that there are no changes required
on client side (e.g. re-writing queries).
Rows are identified by OIDs, and I have set default_with_oids=true, so
that all tables are created with OIDs.

Now, when I run a 'select * from my_table', I can easily get the OID
inside the executor because I get a 'HeapTuple' from ExecSan(). I call
HeapTupleGetOid() on the heap tuple and I have the oid.

The problem is that when I run a query with a projection, e.g.,
'select name from my_table', then the scanner returns a null
HeapTuple. How can I get the OID in such a case?

What would be the best way of doing this? Should I modify the planner
to add 'get oid' as a junk filter? Would it affect something else?

Any comments will be highly appreciated.

Thanks!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting the OID inside the executor

2011-09-01 Thread Tom Lane
Fazool fazoolm...@gmail.com writes:
 I am implementing some functionality into Postgresql, where I want to
 track which row was accessed by a user query. I am implementing the
 functionality inside Postgres, so that there are no changes required
 on client side (e.g. re-writing queries).
 Rows are identified by OIDs, and I have set default_with_oids=true, so
 that all tables are created with OIDs.

It's a serious, serious error to suppose that OIDs are adequate
identifiers for rows in user tables.

We use OIDs to identify rows in system catalogs.  We can get away with
that, more or less, because (a) system catalogs are not likely to
contain billions of rows, and (b) we place a unique index on OID on
every system catalog that has OIDs.  Neither of these statements is
very tenable for user tables.

You haven't said anything about what it is you actually need to
accomplish here, but can you use TIDs as row identifiers?  What's
the required lifetime of the identifiers?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting the OID inside the executor

2011-09-01 Thread Fazool
Thanks for the reply Tom.

As far as I know, the TID of a row can change over time (e.g. when the
row is updated). Isn't it so? If so, it wont solve my problem.

It says here (http://www.postgresql.org/docs/9.0/static/ddl-system-columns.html)
that
The OID, or even better a user-defined serial number, should be used
to identify logical rows., so I thought I can use OID. You are right
as OIDs are 32-bit ints, so they will wrap-around at some point.
The reason why I don't want to use a SERIAL user-column is that I want
the process to be transparent to the user.

What I want accomplished is the following. I want to trace all
accesses (read/update/insert) to rows. If a row was accessed twice, it
should appear twice in the trace. Later (offline), I want to analyze
the trace to find some patterns, and after deducing some results from
the pattern, I should be able to access the original rows in the
database. For example, if my pattern-analysis says that row x is
best, I should be able to read x from the DB. Hence, a tuple should
uniquely be identifiable (have same ID) in the DB and in the trace.

What would you suggest for such a scenario?

Thanks for the help.


On Thu, Sep 1, 2011 at 10:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fazool fazoolm...@gmail.com writes:
 I am implementing some functionality into Postgresql, where I want to
 track which row was accessed by a user query. I am implementing the
 functionality inside Postgres, so that there are no changes required
 on client side (e.g. re-writing queries).
 Rows are identified by OIDs, and I have set default_with_oids=true, so
 that all tables are created with OIDs.

 It's a serious, serious error to suppose that OIDs are adequate
 identifiers for rows in user tables.

 We use OIDs to identify rows in system catalogs.  We can get away with
 that, more or less, because (a) system catalogs are not likely to
 contain billions of rows, and (b) we place a unique index on OID on
 every system catalog that has OIDs.  Neither of these statements is
 very tenable for user tables.

 You haven't said anything about what it is you actually need to
 accomplish here, but can you use TIDs as row identifiers?  What's
 the required lifetime of the identifiers?

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers