[ 
https://issues.apache.org/jira/browse/PHOENIX-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samarth Jain updated PHOENIX-3718:
----------------------------------
    Attachment: columnencoding.md

[~pconrad] - saw your comments late. I ended up publishing the doc, but please 
feel free to edit the md as you see fit.

bq. I don't see a link.
- added in the updated .md file

bq. I get the part about planning the number of columns ahead of time, but I 
don't know what "its view hierarchy to have in the lifecycle" means.
- Phoenix tables can have views declared on them. These views can add their own 
columns, either when the view is created through CREATE VIEW or by adding columns 
later through ALTER TABLE add column. Views can also be declared over other views, 
so the result is a hierarchy with the base table as the root. The limit therefore 
applies to all the columns in the base table as well as in the views that are 
declared on it.
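
As a rough illustration of how the limit spans the hierarchy, here is a small 
Python sketch. It is not Phoenix code: the Node class, total_columns, fits_limit 
and the max_columns parameter are hypothetical names made up for this example, 
and it simply counts every column each table or view declares.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A base table or a view; `views` are the views declared on it."""
    name: str
    columns: List[str]
    views: List["Node"] = field(default_factory=list)


def total_columns(node: Node) -> int:
    """Count the columns declared by this node and by every view beneath it."""
    return len(node.columns) + sum(total_columns(v) for v in node.views)


def fits_limit(base: Node, max_columns: int) -> bool:
    # max_columns is a hypothetical stand-in for whatever limit the
    # chosen column mapping setting imposes on the whole hierarchy.
    return total_columns(base) <= max_columns


# Base table T with two views; V1 has a view of its own.
base = Node("T", ["A", "B"], [
    Node("V1", ["C"], [Node("V1_CHILD", ["D", "E"])]),
    Node("V2", ["F"]),
])
```

Here total_columns(base) counts the columns of T, V1, V1_CHILD and V2 together, 
which is the number that has to stay within the limit.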

bq. It seems like the column mapping property is meant to limit the number of 
columns, since if you don't set it the columns are theoretically unlimited. Is 
that right?
No, the focus of column mapping was to build a layer of indirection between 
the Phoenix column name and the HBase column qualifier. Earlier, the Phoenix 
column names would end up being the HBase column qualifiers too. Column mapping 
also provides various performance benefits, which I have mentioned in the blog.
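
To make the indirection concrete, here is a minimal Python sketch, not the 
actual Phoenix implementation. The ColumnMapping class, the counter starting at 
zero and the big-endian layout are all assumptions made for illustration; the 
only point is that the stored qualifier is a short number rather than the 
column name.

```python
class ColumnMapping:
    """Assigns each column name a short numeric qualifier (illustrative only)."""

    def __init__(self, encoded_bytes: int):
        # Number of bytes each qualifier may use (hypothetical parameter).
        self.encoded_bytes = encoded_bytes
        self._by_name = {}
        self._next = 0  # assumption: numbering starts at 0

    def qualifier_for(self, column_name: str) -> bytes:
        """Return the qualifier for a column, assigning one on first use."""
        if column_name not in self._by_name:
            # int.to_bytes raises OverflowError once the counter no longer
            # fits in encoded_bytes, which is where a column limit comes from.
            qualifier = self._next.to_bytes(self.encoded_bytes, "big")
            self._by_name[column_name] = qualifier
            self._next += 1
        return self._by_name[column_name]


mapping = ColumnMapping(encoded_bytes=2)
first = mapping.qualifier_for("CUSTOMER_FIRST_NAME")
second = mapping.qualifier_for("CUSTOMER_LAST_NAME")
```

In this sketch a long name such as CUSTOMER_FIRST_NAME is stored as a two-byte 
qualifier, and asking for the same name again returns the same qualifier.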

bq. It sounds like phoenix.default.column.encoded.bytes.attrib controls the 
default column mapping property for all tables, and COLUMN_ENCODED_BYTES 
controls the column mapping property for one table as you create it. Is that 
the case?

Yes, the config is global, so every new Phoenix table is created with that 
value as its column mapping property. Users who want something else can set 
COLUMN_ENCODED_BYTES in the CREATE TABLE statement.

bq. Does it make sense to give a sentence declaring what "single cell array 
with offsets" and "one cell per column" mean and how they differ? Is there any 
effect on sharding for one versus the other?

Sure, we can add that. Single cell array with offsets packs all Phoenix columns 
belonging to a column family into a single HBase cell. One cell per column 
encoding means that there is one HBase cell per Phoenix column. There is no 
effect on sharding, as that is determined by the row key and this encoding 
applies only to the key value columns.
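
The difference between the two can be sketched in Python as below. This is only 
an illustration under stated assumptions: the byte layout (a count, then 4-byte 
big-endian offsets, then the concatenated values) is made up for this example 
and is not Phoenix's actual serialization format, and all function names are 
hypothetical.

```python
import struct
from typing import Dict, List, Tuple


def one_cell_per_column(family: str, values: Dict[str, bytes]) -> List[Tuple[str, str, bytes]]:
    """One (family, qualifier, value) cell for every column."""
    return [(family, name, value) for name, value in values.items()]


def single_cell_with_offsets(family: str, values: List[bytes]) -> Tuple[str, bytes]:
    """Pack all values of a column family into a single cell.

    Made-up layout: value count, then one start offset per value plus a
    final end offset (all 4-byte big-endian), then the concatenated values.
    """
    offsets = [0]
    for value in values:
        offsets.append(offsets[-1] + len(value))
    header = struct.pack(f">I{len(offsets)}I", len(values), *offsets)
    return family, header + b"".join(values)


def read_column(cell: bytes, index: int) -> bytes:
    """Fetch one value from a packed cell by looking up its offsets."""
    (count,) = struct.unpack_from(">I", cell, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    start, end = struct.unpack_from(">II", cell, 4 + 4 * index)
    data_start = 4 + 4 * (count + 1)  # skip the count and the offset table
    return cell[data_start + start:data_start + end]


cells = one_cell_per_column("0", {"A": b"a", "B": b"bcd"})
family, packed = single_cell_with_offsets("0", [b"a", b"bcd", b""])
```

In the first form each column is its own cell; in the second, a reader uses the 
offsets to pull an individual column value out of the one packed cell.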

> Provide user-level documentation for column encoding feature
> ------------------------------------------------------------
>
>                 Key: PHOENIX-3718
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3718
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Assignee: Samarth Jain
>         Attachments: columnencoding.md
>
>
> Let's create a new page on our site describing column encoding. I think we 
> should keep it at a very high level, but document when the feature should and 
> should not be used.
> A blog with more technical depth on the implementation might be interesting, 
> though. You up for that [~samarthjain]?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
