Re: [HACKERS] SQL/MED estimated time of arrival?
At 01:36 AM 11/16/2010, Shigeru HANADA wrote: Thanks for the information about Informix VTI. Because I'm not familiar to Informix, I might have missed your point. Would you mind telling me more about Informix VTI? On Mon, 15 Nov 2010 08:45:14 -0800 Eric Davies e...@barrodale.com wrote: With Informix VTI, indexing is the same for native tables as for virtual tables, except the interpretation of the 32 bit rowid is left up to the developer. When you define the VTI class, you optionally supply a method that can fetch data based on a 32 bit rowid, and it's the responsibility of your non-indexed scanning methods to provide rowids along with the row tuple. ISTM that index on a VTI table could be inconsistent when original (remote) data was changed in the way other than VTI. Is it assumed that the data source is never updated without VTI interface? Yes, the data sources are assumed to updated only through the VTI interface. With our UFI product, the data sources are assumed to be unchanging files, you'd need to re-index them if they changed. Having local indexes can be very useful if you have a user that issues queries like: select count(*) from some_external_table where ; With VTI, the count aggregate doesn't get pushed down, meaning that without a local index, your scanning method has to return as many tuples as match the where clause, which can be very slow. How can Informix server optimize such kind of query? Counts the index tuple which match the WHERE clause? That would be my assumption. If so, such optimization seems to be limited to count and wouldn't be able to be useful for max or sum. Or, specialized index or VTI class is responsible to the optimization? If there is an index on the column you want to sum/min/max, and your where clause restricts the query to a particular set of rows based on that index, Informix can get the values for that column from the index (which it needed to scan anyhow) without looking at the table. This isn't particular to VTI, it's just a clever use of indexes. Here is a clipping from one of the Informix manuals on the topic: The way that the optimizer chooses to read a table is called an access plan. The simplest method to access a table is to read it sequentially, which is called a table scan. The optimizer chooses a table scan when most of the table must be read or the table does not have an index that is useful for the query. The optimizer can also choose to access the table by an index. If the column in the index is the same as a column in a filter of the query, the optimizer can use the index to retrieve only the rows that the query requires. The optimizer can use a key-only index scan if the columns requested are within one index on the table. The database server retrieves the needed data from the index and does not access the associated table. Important: The optimizer does not choose a key-only scan for a VARCHAR column. If you want to take advantage of key-only scans, use the ALTER TABLE with the MODFIY clause to change the column to a CHAR data type. The optimizer compares the cost of each plan to determine the best one. The database server derives cost from estimates of the number of I/O operations required, calculations to produce the results, rows accessed, sorting, and so forth. Local indexes also affords the opportunity of using specialized indexes built into the database. My guess is that without some form of rowids being passed back and forth, you couldn't define non-materialized views of virtual tables that could be indexed. That said, we implemented our own btree-like index that used the pushed down predicates because fetching data one row at a time wasn't desirable with our design choices, and we wanted to support virtual tables with more than 4 billion rows. I couldn't see the way to handle virtual table with more than 4 billion rows with 32 bit rowids in local index. Do you mean that your btree-like index searches result rows by predicates directly and skips getbyid()? Exactly. Our own rowids can be up to 64 bits but are never seen by Informix. As far as Informix is concerned, it's a regular table scan because the use of our indexes is hidden. Regards, -- Shigeru Hanada Cheers, Eric. ** Eric Davies, M.Sc. Senior Programmer Analyst Barrodale Computing Services Ltd. 1095 McKenzie Ave., Suite 418 Victoria BC V8P 2L5 Canada Tel: (250) 704-4428 Web: http://www.barrodale.com Email: e...@barrodale.com **
Re: [HACKERS] SQL/MED estimated time of arrival?
With Informix VTI, indexing is the same for native tables as for virtual tables, except the interpretation of the 32 bit rowid is left up to the developer. When you define the VTI class, you optionally supply a method that can fetch data based on a 32 bit rowid, and it's the responsibility of your non-indexed scanning methods to provide rowids along with the row tuple. Having local indexes can be very useful if you have a user that issues queries like: select count(*) from some_external_table where ; With VTI, the count aggregate doesn't get pushed down, meaning that without a local index, your scanning method has to return as many tuples as match the where clause, which can be very slow. Local indexes also affords the opportunity of using specialized indexes built into the database. My guess is that without some form of rowids being passed back and forth, you couldn't define non-materialized views of virtual tables that could be indexed. That said, we implemented our own btree-like index that used the pushed down predicates because fetching data one row at a time wasn't desirable with our design choices, and we wanted to support virtual tables with more than 4 billion rows. Eric At 07:41 PM 11/14/2010, Shigeru HANADA wrote: On Fri, 12 Nov 2010 08:27:54 -0800 Eric Davies e...@barrodale.com wrote: Thank you for the time estimate and the interface discussion. It sounds like the PostgreSQL SQL/MED code will be very useful when it is done. Our product provides read-only access to files, so updates/inserts/deletes aren't an issue for us. One thing that is not clear to me is indexing support. Will it be possible to index a SQL/MED table as if it were a regular table? No, SQL/MED would not support indexing foreign tables, at least in first version. Because it would be difficult to use common row id for various FDWs. To support indexing foreign tables might need to change common structure of index tuple to be able to hold virtual row-id, not ItemPointerData. Instead, FDW can handle expressions which are parsed from WHERE clause and JOIN condition of original SQL, and use them to optimize scanning. For example, FDW for PostgreSQL pushes some conditions down to remote side to decrease result tuples to be transferred. I hope this idea helps you. What would be the equivalent of Informix's row ids? Answer to the second question would be ItemPointerData. It consists of a block number and an offset in the block, and consume 6 bytes for each tuple. With this information, PostgreSQL can access to a data tuple directly. Actual definition is: typedef struct ItemPointerData { BlockIdData ip_blkid; OffsetNumber ip_posid; } ItemPointer; Does Informix uses common row-id (AFAIK it's 4 bytes integer) for both of virtual tables and normal tables? Regards, -- Shigeru Hanada ** Eric Davies, M.Sc. Senior Programmer Analyst Barrodale Computing Services Ltd. 1095 McKenzie Ave., Suite 418 Victoria BC V8P 2L5 Canada Tel: (250) 704-4428 Web: http://www.barrodale.com Email: e...@barrodale.com **
Re: [HACKERS] SQL/MED estimated time of arrival?
Hi Gentlemen, Thank you for the time estimate and the interface discussion. It sounds like the PostgreSQL SQL/MED code will be very useful when it is done. Our product provides read-only access to files, so updates/inserts/deletes aren't an issue for us. One thing that is not clear to me is indexing support. Will it be possible to index a SQL/MED table as if it were a regular table? What would be the equivalent of Informix's row ids? Eric. ** Eric Davies, M.Sc. Senior Programmer Analyst Barrodale Computing Services Ltd. 1095 McKenzie Ave., Suite 418 Victoria BC V8P 2L5 Canada Tel: (250) 704-4428 Web: http://www.barrodale.com Email: e...@barrodale.com **
[HACKERS] SQL/MED estimated time of arrival?
Hi SQL/MED developers, Our company has just finished development of a database extension for Informix that provides tabular access to various types of structured files (NetCDF and HDF5, with more types to come). We would like to port this logic to run on PostgreSQL, since many of our potential customers use PostgreSQL. On Informix, we were able to take advantage of the VTI (Virtual Table Interface) feature to support table scans and indexing. (See http://www.ibm.com/developerworks/data/zones/informix/library/techarticle/db_vti.html .) Do you have any idea of how long it will be before SQL/MED on PostgreSQL will be available, and perhaps how similar it will be to Informix VTI? Thanks, Eric. ** Eric Davies, M.Sc. Senior Programmer Analyst Barrodale Computing Services Ltd. 1095 McKenzie Ave., Suite 418 Victoria BC V8P 2L5 Canada Tel: (250) 704-4428 Web: http://www.barrodale.com Email: e...@barrodale.com **
[HACKERS] mechanism used to store images in Postgres
We're looking at storing the equivalent of very large rasters images in postgres. Our understanding is that postgres stores large objects as little pieces in rows of a table. As our images would be 30MB-4GB in size, this would be a lot of rows. Has anybody used a different mechanism to store their data that offers higher performance? Thank you, Eric. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])