Hi Hannu,

On 10/14/07 12:58 AM, "Hannu Krosing" <[EMAIL PROTECTED]> wrote:

> What has happened in reality, is that the speed difference between CPU,
> RAM and disk speeds has _increased_ tremendously

Yes.

> which makes it even
> more important to _decrease_ the size of stored data if you want good
> performance

Or bring the CPU processing closer to the data it's using (or both).

Left alone, the trend you mention first will continue indefinitely - the
consequence is that the "distance" between a processor and its target data
will keep increasing.

By contrast, you can only decrease the data volume so much - so in the end
you'll be left with the same problem - the data needs to be closer to the
processing.  This is the essence of parallel / shared nothing architecture.
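
To make the shared-nothing idea concrete, here is a minimal sketch. It is a toy illustration and not Greenplum's code: the names `partition`, `local_sum` and the modulo distribution rule are assumptions made up for the example. Each "segment" owns a slice of the rows and aggregates only its own slice; only the small partial results travel to the coordinator.

```python
# Toy shared-nothing aggregation. All names here are hypothetical.

def partition(rows, n_segments):
    """Distribute (key, value) rows across segments by key modulo n_segments."""
    segments = [[] for _ in range(n_segments)]
    for key, value in rows:
        segments[key % n_segments].append((key, value))
    return segments

def local_sum(segment):
    """Work done next to the data: each segment sums only the rows it owns."""
    return sum(value for _, value in segment)

rows = [(i, i * 10) for i in range(1, 9)]    # keys 1..8, values 10..80
segments = partition(rows, 4)

# Only one number per segment is shipped to the coordinator.
partials = [local_sum(segment) for segment in segments]
total = sum(partials)
```

In a real system each segment would be a separate process or host with its own disks, so the per-segment loop runs where the data lives.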

Note that we've done this at Greenplum.  We're also implementing a DSM-like
capability and are investigating a couple of different hybrid row / column
store approaches.

Bitmap index with index-only access does provide nearly all of the
advantages of a column store from a speed standpoint BTW.  Even though
Vertica is touting speed advantages - our parallel engine plus bitmap index
will crush them in benchmarks when they show up with real code.
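
A rough sketch of why a bitmap index with index-only access behaves like a column store for this kind of query. This is a toy model with Python integers standing in for bitmaps, not the on-disk format of any real engine; the table contents and the helper names `build_bitmap_index` and `count_matching` are invented for illustration.

```python
# Toy bitmap index: one bitmap per distinct value, bit i set means row i matches.
from collections import defaultdict

def build_bitmap_index(values):
    """Build {value: bitmap} for one column, using an int as the bitmap."""
    index = defaultdict(int)
    for row_id, value in enumerate(values):
        index[value] |= 1 << row_id
    return dict(index)

# Hypothetical table with two low-cardinality columns.
region = ["EU", "US", "EU", "ASIA", "US", "EU"]
status = ["open", "open", "closed", "open", "closed", "open"]

region_idx = build_bitmap_index(region)
status_idx = build_bitmap_index(status)

def count_matching(bitmap_a, bitmap_b):
    """COUNT(*) WHERE a AND b, answered from the bitmaps without reading rows."""
    return bin(bitmap_a & bitmap_b).count("1")

# SELECT count(*) WHERE region = 'EU' AND status = 'open'
eu_open = count_matching(region_idx["EU"], status_idx["open"])

# SELECT region, count(*) GROUP BY region, again from the index alone
rows_per_region = {value: bin(bitmap).count("1") for value, bitmap in region_idx.items()}
```

The query touches only the bitmaps for the referenced columns, which is the same I/O saving a column store gets from skipping unreferenced columns.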

Meanwhile they're moving on to new ideas - I kid you not, "Horizontica" is
Dr. Stonebraker's new idea :-)

So - bottom line - some ideas from column store make sense, but it's not a
cure-all.
 
> There is also a MonetDB/X100 project, which tries to make MonetDB
> order(s) of magnitude faster by doing in-page compression in order to
> get even more performance, see:

Actually, the majority of the points made by the MonetDB team involve
decreasing the abstractions in the processing path to improve the IPC
(instructions per clock) efficiency of the executor.

We are also planning to do this by operating on data in vectors of projected
rows in the executor, which will increase the IPC by reducing I-cache misses
and improving D-cache locality.  Tight loops will make a much bigger
difference when long runs of data are the target operands.
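
The difference in control flow can be sketched as follows. This shows the shape of the two executors only: Python will not show the I-cache or D-cache effects, and the function names, the `price` column and the batch size are assumptions for the example rather than anything from an actual executor.

```python
# Tuple-at-a-time versus vector-at-a-time evaluation of
#   SELECT sum(price) WHERE price > threshold
# All names are hypothetical.

def filter_sum_tuple_at_a_time(rows, threshold):
    """Every row walks through projection, selection and aggregation in turn."""
    total = 0
    for row in rows:
        price = row["price"]       # projection
        if price > threshold:      # selection
            total += price         # aggregation
    return total

def batches(rows, column, size):
    """Project one column and hand it out as fixed-size vectors."""
    for start in range(0, len(rows), size):
        yield [row[column] for row in rows[start:start + size]]

def filter_sum_vectorized(rows, threshold, size=4):
    """Each operator runs a tight loop over a whole vector before the next one starts."""
    total = 0
    for vector in batches(rows, "price", size):
        selected = [p for p in vector if p > threshold]   # selection over the vector
        total += sum(selected)                            # aggregation over the vector
    return total

prices = [5, 12, 7, 30, 18, 2, 25, 9, 14]
rows = [{"id": i, "price": p} for i, p in enumerate(prices)]

slow = filter_sum_tuple_at_a_time(rows, 10)
fast = filter_sum_vectorized(rows, 10)
```

Both functions return the same answer; in a compiled executor the second form keeps one operator's code hot in the instruction cache while it streams through a contiguous run of values.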

- Luke 


