[ 
https://issues.apache.org/jira/browse/PHOENIX-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797442#comment-15797442
 ] 

Samarth Jain commented on PHOENIX-2565:
---------------------------------------

Another option would be to move the offset information into key values of 
their own instead of including it in the packed cell itself. In fact, instead 
of storing offsets, we could then just store the starting position and length 
of each column's bytes. We can use our number-based qualifiers for storing 
this start/length info. That way we only need to store metadata for those 
columns whose values have actually been packed in our "data" cell.

Let me try to explain with an example:

Consider the data cell that looks something like this:
single_cell_column_qualifier :: col1\col20\col60\col100 where \ represents a 
separator byte.

Then, the metadata cells with this new encoding scheme would look something 
like this (counting one separator byte between consecutive values):
cq  :: start position/length (value)
1 :: 0/10
20 :: 11/20
60 :: 32/10
100 :: 43/20
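
A rough Java sketch of how the write side could compute these start/length 
pairs while packing the values. The class name, the method name, the in-memory 
int[] {start, length} representation and the zero separator byte are all made 
up for illustration; none of it comes from the attached patches, and how the 
pair would be serialized into the metadata cell value is left open.

```java
import java.io.ByteArrayOutputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedMap;

// Hypothetical helper, for illustration only.
final class PackedCellWriterSketch {
    // Assumed separator byte; the real value would be whatever the encoding picks.
    static final byte SEPARATOR = 0;

    /** The packed "data" cell bytes plus one {start, length} pair per qualifier. */
    static final class Packed {
        final byte[] data;
        final Map<Integer, int[]> positions;

        Packed(byte[] data, Map<Integer, int[]> positions) {
            this.data = data;
            this.positions = positions;
        }
    }

    /** Packs the values in qualifier order, recording where each one lands. */
    static Packed pack(SortedMap<Integer, byte[]> columns) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Map<Integer, int[]> positions = new LinkedHashMap<>();
        boolean first = true;
        for (Map.Entry<Integer, byte[]> e : columns.entrySet()) {
            if (!first) {
                out.write(SEPARATOR); // one separator byte between values
            }
            first = false;
            byte[] value = e.getValue();
            // out.size() is the start position of this value in the packed cell.
            positions.put(e.getKey(), new int[] {out.size(), value.length});
            out.write(value, 0, value.length);
        }
        return new Packed(out.toByteArray(), positions);
    }
}
```

Only columns that are actually present in the map get a start/length entry, 
which is the point of keeping this metadata outside the packed cell.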

So now, to fetch the value of col1, whose qualifier is also 1, we would first 
get the start position and length by fetching the value of column qualifier 1. 
Then, using that start position/length, we could easily extract the bytes out 
of the packed data cell.
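
The read path could be sketched roughly as below. This is only an 
illustration: the class and method names are hypothetical, and the metadata 
cells are modeled as an in-memory map from qualifier to int[] {start, length} 
rather than as real key values.

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class PackedCellReaderSketch {
    /**
     * Returns a copy of one column's bytes from the packed data cell, or null
     * if there is no metadata entry for the qualifier (column not packed).
     */
    static byte[] readColumn(byte[] dataCell, Map<Integer, int[]> metadata, int qualifier) {
        int[] pos = metadata.get(qualifier);
        if (pos == null) {
            return null;
        }
        int start = pos[0];
        int length = pos[1];
        // Guard against metadata that does not match the data cell.
        if (start < 0 || length < 0 || start + length > dataCell.length) {
            throw new IllegalArgumentException(
                "start/length out of range for qualifier " + qualifier);
        }
        return Arrays.copyOfRange(dataCell, start, start + length);
    }
}
```

Note that the lookup never scans for separator bytes; the start and length 
alone are enough to locate the value.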

> Store data for immutable tables in single KeyValue
> --------------------------------------------------
>
>                 Key: PHOENIX-2565
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2565
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: Thomas D'Silva
>         Attachments: PHOENIX-2565-v2.patch, PHOENIX-2565-wip.patch, 
> PHOENIX-2565.patch
>
>
> Since an immutable table (i.e. declared with IMMUTABLE_ROWS=true) will never 
> update a column value, it'd be more efficient to store all column values for 
> a row in a single KeyValue. We could use the existing format we have for 
> variable length arrays.
> For backward compatibility, we'd need to support the current mechanism. Also, 
> you'd no longer be allowed to transition an existing table to/from being 
> immutable. I think the best approach would be to introduce a new IMMUTABLE 
> keyword and use it like this:
> {code}
> CREATE IMMUTABLE TABLE ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
