On 12/29/16 9:55 PM, Haribabu Kommi wrote:
The tuples which don't have multiple copies or frozen data will be moved
from WOS to ROS periodically by the background worker process or autovacuum
process. Each column's data is stored separately in its own relation file. No
transaction information is present in ROS. The data in ROS can be
referenced by tuple ID.

Would updates be handled via the delete mechanism you described then?

In this approach, the column data is present in both heap and columnar
storage.

ISTM one of the biggest reasons to prefer a column store over heap is to ditch the 24-byte per-tuple header overhead. If the data still lives in the heap as well, I'm not sure how much of a win this is.

Another complication is that one of the big advantages of a CSTORE is allowing analysis to be done efficiently on a column-by-column (as opposed to row-by-row) basis. Does your patch by chance provide that?

Generally speaking, I do think the idea of adding support for this as an "index" is a really good starting point, since that part of the system is pluggable. It might be better to target getting only what needs to be in core into core to begin with, allowing the other code to remain an extension for now. I think there are a lot of things that will be discovered as we start moving into column stores, and it'd be very unfortunate to accidentally paint the core code into a corner somewhere.

As a side note, it's possible to get a lot of the benefits of a column store by using arrays. I've done some experiments with that and got an 80-90% space reduction, and most queries saw improved performance as well (there were a few cases that weren't better). The biggest advantage to this approach is people could start using it today, on any recent version of Postgres. That would be a great way to gain knowledge on what users would want to see in a column store, something else I suspect we need. It would also be far less code than what you or Alvaro are proposing. When it comes to large changes that don't have crystal-clear requirements, I think that's really important.
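
To put rough numbers on the array idea, here is a back-of-the-envelope sketch in Python. It is not a measurement and not the experiment mentioned above; every size constant and both function names are assumptions chosen for illustration (a 24-byte tuple header, a 4-byte line pointer, an 8-byte key, int4 values, and roughly 24 bytes of header for a one-dimensional array without nulls). It ignores alignment padding, page headers, TOAST, and compression.

```python
# Illustrative arithmetic only. All constants are assumptions, not measurements.
TUPLE_HEADER = 24  # per-row heap tuple header, in bytes
LINE_POINTER = 4   # per-row item pointer in the page (assumed)
ARRAY_HEADER = 24  # assumed header for a 1-D array with no nulls
KEY_SIZE = 8       # assumed size of the key column(s) kept on every row
VALUE_SIZE = 4     # int4


def row_per_value_bytes(n_values):
    """Estimate storage when every value gets its own heap row."""
    per_row = TUPLE_HEADER + LINE_POINTER + KEY_SIZE + VALUE_SIZE
    return n_values * per_row


def array_packed_bytes(n_values, values_per_row=100):
    """Estimate storage when values are packed into one array per row."""
    n_rows = -(-n_values // values_per_row)  # ceiling division
    per_row_overhead = TUPLE_HEADER + LINE_POINTER + KEY_SIZE + ARRAY_HEADER
    return n_rows * per_row_overhead + n_values * VALUE_SIZE


if __name__ == "__main__":
    n = 1_000_000
    flat = row_per_value_bytes(n)
    packed = array_packed_bytes(n)
    print(f"row per value: {flat:,} bytes")
    print(f"array packed:  {packed:,} bytes")
    print(f"estimated reduction: {1 - packed / flat:.1%}")
```

Under these assumptions most of the saving comes from paying the per-row overhead once per 100 values instead of once per value. Real tables will differ depending on column widths, alignment, and whether the arrays end up TOASTed or compressed.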
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

