Re: [HACKERS] SQL/MED estimated time of arrival?

2010-11-16 Thread Eric Davies

At 01:36 AM 11/16/2010, Shigeru HANADA wrote:

Thanks for the information about Informix VTI.  Because I'm not
familiar to Informix, I might have missed your point.  Would you mind
telling me more about Informix VTI?




On Mon, 15 Nov 2010 08:45:14 -0800
Eric Davies e...@barrodale.com wrote:
 With Informix VTI, indexing is the same for native tables as for
 virtual tables, except the interpretation of the 32 bit rowid is left
 up to the developer. When you define the VTI class, you optionally
 supply a method that can fetch data based on a 32 bit rowid, and it's
 the responsibility of your non-indexed scanning methods to provide
 rowids along with the row tuple.

ISTM that index on a VTI table could be inconsistent when original
(remote) data was changed in the way other than VTI.  Is it assumed
that the data source is never updated without VTI interface?


Yes, the data sources are assumed to updated only through the VTI interface.
With our UFI product, the data sources are assumed to be unchanging 
files, you'd need to re-index them if they changed.




 Having local indexes can be very useful if you have a user that
 issues queries like:
 select count(*) from some_external_table where  ;
 With VTI, the count aggregate doesn't get pushed down, meaning that
 without a local index, your scanning method has to return as many
 tuples as match the where clause, which can be very slow.

How can Informix server optimize such kind of query?  Counts the index
tuple which match the WHERE clause?


That would be my assumption.



 If so, such optimization seems to
be limited to count and wouldn't be able to be useful for max or
sum.  Or, specialized index or VTI class is responsible to the
optimization?


If there is an index on the column you want to sum/min/max, and your 
where clause restricts the query to a particular set of rows based on 
that index, Informix can get the values for that column from the 
index (which it needed to scan anyhow) without looking at the table. 
This isn't particular to VTI, it's just a clever use of indexes.


Here is a clipping from one of the Informix manuals on the topic:
The way that the optimizer chooses to read a table is called an 
access plan. The simplest method to access a table is to read it 
sequentially, which is called a table scan. The optimizer chooses a 
table scan when most of the table must be read or the table does not 
have an index that is useful for the query.
The optimizer can also choose to access the table by an index. If the 
column in the index is the same as a column in a filter of the query, 
the optimizer can use the index to retrieve only the rows that the 
query requires. The optimizer can use a key-only index scan if the 
columns requested are within one index on the table. The database 
server retrieves the needed data from the index and does not access 
the associated table.

Important:
The optimizer does not choose a key-only scan for a VARCHAR column. 
If you want to take advantage of key-only scans, use the ALTER TABLE 
with the MODFIY clause to change the column to a CHAR data type.
The optimizer compares the cost of each plan to determine the best 
one. The database server derives cost from estimates of the number of 
I/O operations required, calculations to produce the results, rows 
accessed, sorting, and so forth.




 Local indexes also affords the opportunity of using specialized
 indexes built into the database. My guess is that without some form
 of rowids being passed back and forth, you couldn't define
 non-materialized views of virtual tables that could be indexed.

 That said, we implemented our own btree-like index that used the
 pushed down predicates because fetching data one row at a time wasn't
 desirable with our design choices, and we wanted to support virtual
 tables with more than 4 billion rows.

I couldn't see the way to handle virtual table with more than 4
billion rows with 32 bit rowids in local index.  Do you mean that your
btree-like index searches result rows by predicates directly and
skips getbyid()?


Exactly. Our own rowids can be up to 64 bits but are never seen by 
Informix. As far as Informix is concerned, it's a regular table scan 
because the use of our indexes is hidden.




Regards,
--
Shigeru Hanada



Cheers,
Eric.


**
Eric Davies, M.Sc.
Senior Programmer Analyst
Barrodale Computing Services Ltd.
1095 McKenzie Ave., Suite 418
Victoria BC V8P 2L5
Canada

Tel: (250) 704-4428
Web: http://www.barrodale.com
Email: e...@barrodale.com
**




Re: [HACKERS] SQL/MED estimated time of arrival?

2010-11-15 Thread Eric Davies
With Informix VTI, indexing is the same for native tables as for 
virtual tables, except the interpretation of the 32 bit rowid is left 
up to the developer. When you define the VTI class, you optionally 
supply a method that can fetch data based on a 32 bit rowid, and it's 
the responsibility of your non-indexed scanning methods to provide 
rowids along with the row tuple.


Having local indexes can be very useful if you have a user that 
issues queries like:

   select count(*) from some_external_table where  ;
With VTI, the count aggregate doesn't get pushed down, meaning that 
without a local index, your scanning method has to return as many 
tuples as match the where clause, which can be very slow.


Local indexes also affords the opportunity of using specialized 
indexes built into the database. My guess is that without some form 
of rowids being passed back and forth, you couldn't define 
non-materialized views of virtual tables that could be indexed.


That said, we implemented our own btree-like index that used the 
pushed down predicates because fetching data one row at a time wasn't 
desirable with our design choices, and we wanted to support virtual 
tables with more than 4 billion rows.


Eric
At 07:41 PM 11/14/2010, Shigeru HANADA wrote:

On Fri, 12 Nov 2010 08:27:54 -0800
Eric Davies e...@barrodale.com wrote:
 Thank you for the time estimate and the interface discussion. It
 sounds like the PostgreSQL SQL/MED code will be very useful when it
 is done. Our product provides read-only access to files, so
 updates/inserts/deletes aren't an issue for us.

 One thing that is not clear to me is indexing support. Will it be
 possible to index a SQL/MED table as if it were a regular table?

No, SQL/MED would not support indexing foreign tables, at least in
first version.  Because it would be difficult to use common row id for
various FDWs.  To support indexing foreign tables might need to change
common structure of index tuple to be able to hold virtual row-id, not
ItemPointerData.

Instead, FDW can handle expressions which are parsed from WHERE clause
and JOIN condition of original SQL, and use them to optimize scanning.
For example, FDW for PostgreSQL pushes some conditions down to remote
side to decrease result tuples to be transferred.  I hope this idea
helps you.

  What
 would be the equivalent of Informix's row ids?

Answer to the second question would be ItemPointerData.  It consists
of a block number and an offset in the block, and consume 6 bytes for
each tuple.  With this information, PostgreSQL can access to a data
tuple directly.  Actual definition is:

typedef struct ItemPointerData
{
BlockIdData ip_blkid;
OffsetNumber ip_posid;
} ItemPointer;

Does Informix uses common row-id (AFAIK it's 4 bytes integer) for
both of virtual tables and normal tables?

Regards,
--
Shigeru Hanada


**
Eric Davies, M.Sc.
Senior Programmer Analyst
Barrodale Computing Services Ltd.
1095 McKenzie Ave., Suite 418
Victoria BC V8P 2L5
Canada

Tel: (250) 704-4428
Web: http://www.barrodale.com
Email: e...@barrodale.com
**




Re: [HACKERS] SQL/MED estimated time of arrival?

2010-11-12 Thread Eric Davies

Hi Gentlemen,

Thank you for the time estimate and the interface discussion. It 
sounds like the PostgreSQL SQL/MED code will be very useful when it 
is done. Our product provides read-only access to files, so 
updates/inserts/deletes aren't an issue for us.


One thing that is not clear to me is indexing support. Will it be 
possible to index a SQL/MED table as if it were a regular table? What 
would be the equivalent of Informix's row ids?


Eric.



**
Eric Davies, M.Sc.
Senior Programmer Analyst
Barrodale Computing Services Ltd.
1095 McKenzie Ave., Suite 418
Victoria BC V8P 2L5
Canada

Tel: (250) 704-4428
Web: http://www.barrodale.com
Email: e...@barrodale.com
**




[HACKERS] SQL/MED estimated time of arrival?

2010-11-03 Thread Eric Davies

Hi SQL/MED developers,

Our company has just finished development of a database extension for 
Informix that provides tabular access to various types of structured 
files (NetCDF and HDF5, with more types to come). We would like to 
port this logic to run on PostgreSQL, since many of our potential 
customers use PostgreSQL.


On Informix, we were able to take advantage of the VTI (Virtual Table 
Interface) feature to support table scans and indexing. (See 
http://www.ibm.com/developerworks/data/zones/informix/library/techarticle/db_vti.html 
.) Do you have any idea of how long it will be before SQL/MED on 
PostgreSQL will be available, and perhaps how similar it will be to 
Informix VTI?


Thanks,
Eric.

**
Eric Davies, M.Sc.
Senior Programmer Analyst
Barrodale Computing Services Ltd.
1095 McKenzie Ave., Suite 418
Victoria BC V8P 2L5
Canada

Tel: (250) 704-4428
Web: http://www.barrodale.com
Email: e...@barrodale.com
**




[HACKERS] mechanism used to store images in Postgres

2003-11-28 Thread Eric Davies
We're looking at storing the equivalent of very large rasters images
in
postgres. Our understanding is that postgres stores large objects
as little pieces in rows of a table. As our images would be 30MB-4GB
in size, this would be a lot of rows. Has anybody used a different
mechanism to store their data that offers higher performance?

Thank you,
Eric.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])