Re: Zedstore - compressed in-core columnar storage

Heikki Linnakangas Thu, 29 Aug 2019 05:09:57 -0700

On 29/08/2019 14:30, Ashutosh Sharma wrote:

On Wed, Aug 28, 2019 at 5:30 AM Alexandra Wang wrote:
    You are correct that we currently go through each item in the leaf page
    that contains the given tid, specifically, the logic to retrieve all the
    attribute items inside a ZSAttStream is now moved to decode_attstream()
    in the latest code, and then in zsbt_attr_fetch() we again loop through
    each item we previously retrieved from decode_attstream() and look for
    the given tid.

Okay. Any idea why this new way of storing attribute data as streams (lowerstream and upperstream) has been chosen just for the attributes but not for tids? Are only attribute blocks compressed but not the tid blocks?

Right, only attribute blocks are currently compressed. TID blocks need to be modified when there are UPDATEs or DELETEs, so I think having to decompress and recompress them would be more costly. Also, there is no user data on the TID tree, and the Simple-8b encoded codewords used to represent the TIDs are already pretty compact. I'm not sure how much gain you would get from passing them through a general-purpose compressor.

I could be wrong though. We could certainly try it out, and see how it performs.
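
To give a rough feel for why delta-encoded TIDs are already compact, here is a minimal sketch of a Simple-8b-style packer in Python. It is not the Zedstore code: the mode table follows the commonly described Simple-8b layout but leaves out the run-length selectors, so the selector numbering is specific to this sketch, and the names `MODES`, `encode_tids` and `decode_tids` are made up for illustration.

```python
# (count, bits) per selector: each 64-bit codeword holds a 4-bit selector
# plus `count` integers of `bits` bits each in the remaining 60 bits.
# Simplified: the run-length modes of full Simple-8b are omitted.
MODES = [(60, 1), (30, 2), (20, 3), (15, 4), (12, 5), (10, 6), (8, 7),
         (7, 8), (6, 10), (5, 12), (4, 15), (3, 20), (2, 30), (1, 60)]


def encode_tids(tids):
    """Pack a sorted list of non-negative TIDs into 64-bit codewords."""
    if not tids:
        return []
    # Store the first TID as-is, then the gap to each following TID.
    deltas = [tids[0]] + [b - a for a, b in zip(tids, tids[1:])]
    words = []
    pos = 0
    while pos < len(deltas):
        # Try the densest mode first; take the first one that fits.
        for selector, (count, bits) in enumerate(MODES):
            chunk = deltas[pos:pos + count]
            if len(chunk) == count and all(0 <= d < (1 << bits) for d in chunk):
                word = selector << 60
                for i, d in enumerate(chunk):
                    word |= d << (i * bits)
                words.append(word)
                pos += count
                break
        else:
            raise ValueError("delta out of range (input must be sorted, "
                             "with deltas below 2**60)")
    return words


def decode_tids(words):
    """Inverse of encode_tids: expand codewords back into the TID list."""
    tids = []
    prev = 0
    for word in words:
        count, bits = MODES[word >> 60]
        mask = (1 << bits) - 1
        for i in range(count):
            prev += (word >> (i * bits)) & mask
            tids.append(prev)
    return tids
```

With this sketch, 1000 consecutive TIDs pack into 18 codewords (144 bytes), which leaves a general-purpose compressor little to work with. Inserting or removing a TID in the middle only requires re-encoding the codewords from that point on, whereas a block-compressed page would have to be decompressed and recompressed as a whole.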

    One optimization we can do is to tell decode_attstream() to stop
    decoding at the tid we are interested in. We can also apply other tricks
    to speed up the lookups in the page, for fixed length attribute, it is
    easy to do binary search instead of linear search, and for variable
    length attribute, we can probably try something that we didn't think of
    yet.

I think we can probably ask decode_attstream() to stop once it has found the tid that we are searching for, but then we only need to do that for Index Scans.

I've been thinking that we should add a few "bookmarks" on long streams, so that you could skip e.g. to the midpoint in a stream. It's a tradeoff though; when you add more information for random access, it makes the representation less compact.
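
The bookmark idea, combined with stopping the scan early, can be sketched roughly as follows in Python. This is a toy model under stated assumptions: the stream is a plain list of `(tid, value)` pairs rather than an encoded byte stream, a bookmark is a `(tid, position)` pair, and `build_bookmarks` and `fetch` are hypothetical names, not functions in the Zedstore code.

```python
from bisect import bisect_right


def build_bookmarks(items, every):
    """Record (tid, position) for every `every`-th item of a sorted stream."""
    return [(items[pos][0], pos) for pos in range(0, len(items), every)]


def fetch(items, bookmarks, target_tid):
    """Look up target_tid; return (value or None, number of items visited)."""
    # Find the last bookmark whose TID is <= target_tid and start there.
    i = bisect_right(bookmarks, (target_tid, float("inf"))) - 1
    start = bookmarks[i][1] if i >= 0 else 0
    visited = 0
    for pos in range(start, len(items)):
        tid, value = items[pos]
        visited += 1
        if tid == target_tid:
            return value, visited
        if tid > target_tid:
            break  # past the target: stop decoding early
    return None, visited
```

For a stream of 1000 items with a bookmark every 100 items, a lookup of TID 750 visits 50 items instead of 750, at the cost of storing 10 extra bookmark entries. That extra storage is the compactness tradeoff mentioned above.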

    Zedstore currently implements update as delete+insert, hence the old tid
    is not reused. We don't store the tuple in our UNDO log, and we only
    store the transaction information in the UNDO log. Reusing the tid of
    the old tuple means putting the old tuple in the UNDO log, which we have
    not implemented yet.

Okay, so that means performing an update on a non-key attribute would also require changes in the index table. In short, HOT update is currently not possible with a zedstore table. Am I right?

That's right. There's a lot of potential gain for doing HOT updates. For example, if you UPDATE one column on every row of a table, ideally you would only modify the attribute tree containing that column. But that hasn't been implemented.
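
The difference can be illustrated with a toy model in Python. It assumes one dict per column standing in for each attribute tree, ignores the TID tree, UNDO log and visibility entirely, and the `update_in_place` method is purely hypothetical since nothing like it has been implemented; all names are invented for this sketch.

```python
class ToyColumnTable:
    """One dict per column, keyed by TID; a stand-in for attribute trees."""

    def __init__(self, columns):
        self.trees = {name: {} for name in columns}
        self.next_tid = 1

    def insert(self, row):
        """Store a full row under a fresh TID; every tree is written."""
        tid = self.next_tid
        self.next_tid += 1
        for name, tree in self.trees.items():
            tree[tid] = row[name]
        return tid, set(self.trees)

    def update_as_delete_insert(self, old_tid, changes):
        """Current behaviour in spirit: the new version gets a new TID."""
        row = {name: tree[old_tid] for name, tree in self.trees.items()}
        row.update(changes)
        # The old version is left in place here; a real system would mark
        # it deleted and remove it later.
        return self.insert(row)

    def update_in_place(self, tid, changes):
        """Hypothetical HOT-like update: same TID, only changed columns."""
        for name, value in changes.items():
            self.trees[name][tid] = value
        return tid, set(changes)
```

In this model, updating one of three columns through `update_as_delete_insert` writes all three trees and produces a new TID, so any index on the table would need a new entry. The hypothetical `update_in_place` writes only the changed column and keeps the TID.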


- Heikki
