[
https://issues.apache.org/jira/browse/PHOENIX-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Samarth Jain updated PHOENIX-3718:
----------------------------------
Attachment: columnencoding.md
[~pconrad] - saw your comments late. I ended up publishing the doc, but please
feel free to edit the md as you see fit.
bq. I don't see a link.
- added in the updated .md file
bq. I get the part about planning the number of columns ahead of time, but I
don't know what "its view hierarchy to have in the lifecycle" means.
- Phoenix tables can have views declared on them. These views can add their own
columns, either when the view is created through CREATE VIEW or later through
ALTER VIEW ... ADD. Views can also be declared over other views, so the result
is a hierarchy with the base table as the root. The limit applies to all the
columns in the base table plus the columns of every view declared on it,
directly or indirectly.
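To make the counting concrete, here is a minimal Python sketch of such a
hierarchy. It is only a model, not Phoenix code: the Node class, the table and
view names, and the columns are all made up, and it simply counts every
declared column without going into which columns Phoenix actually encodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a base table or a view."""
    name: str
    columns: list
    views: list = field(default_factory=list)


def count_columns(node):
    """Columns declared on this node plus those of every view beneath it."""
    return len(node.columns) + sum(count_columns(v) for v in node.views)


# Base table with two views; one of the views has a view of its own.
base = Node("EVENTS", ["HOST", "TS", "VAL"])
web = Node("EVENTS_WEB", ["URL"])
web_eu = Node("EVENTS_WEB_EU", ["REGION", "CONSENT"])
batch = Node("EVENTS_BATCH", ["JOB_ID"])

web.views.append(web_eu)
base.views.extend([web, batch])

total = count_columns(base)
print(total)  # 7: 3 on the base table, 1 + 2 + 1 on the views
```

The number to plan for is the total for the root, not the count for any single
table or view.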
bq. It seems like the column mapping property is meant to limit the number of
columns, since if you don't set it the columns are theoretically unlimited. Is
that right?
No, the focus of column mapping was to build a layer of indirection between
the Phoenix column name and the HBase column qualifier. Earlier, the Phoenix
column name was used directly as the HBase column qualifier. Column mapping
also provides various performance benefits, which I have described in the blog.
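As a rough illustration of that indirection, the Python sketch below hands out
fixed-width numeric qualifiers for column names. It is a toy model under my own
assumptions: the ColumnMapping class, the starting qualifier of 0, and the
big-endian layout are hypothetical and do not describe how Phoenix numbers or
serializes qualifiers.

```python
class ColumnMapping:
    """Toy model: maps each column name to a fixed-width numeric qualifier."""

    def __init__(self, encoded_bytes, first_qualifier=0):
        self.encoded_bytes = encoded_bytes      # width of each qualifier
        self.next_qualifier = first_qualifier   # hypothetical starting value
        self.by_name = {}

    def add_column(self, name):
        # A qualifier of n bytes can hold at most 256**n distinct values.
        if self.next_qualifier >= 256 ** self.encoded_bytes:
            raise ValueError("no qualifiers left for this width")
        qualifier = self.next_qualifier.to_bytes(self.encoded_bytes, "big")
        self.by_name[name] = qualifier
        self.next_qualifier += 1
        return qualifier


mapping = ColumnMapping(encoded_bytes=2)
q1 = mapping.add_column("CUSTOMER_LIFETIME_VALUE")
q2 = mapping.add_column("LAST_LOGIN_TS")

print(q1, q2)                                   # b'\x00\x00' b'\x00\x01'
print(len("CUSTOMER_LIFETIME_VALUE"), len(q1))  # 23 2
```

Without the mapping, the 23-character name itself would be stored as the
qualifier; with it, the stored qualifier is 2 bytes regardless of the name.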
bq. It sounds like phoenix.default.column.encoded.bytes.attrib controls the
default column mapping property for all tables, and COLUMN_ENCODED_BYTES
controls the column mapping property for one table as you create it. Is that
the case?
Yes, the config is global, so every new Phoenix table is created with that
value as its column mapping property. Users who want something different for a
particular table can set COLUMN_ENCODED_BYTES in the CREATE TABLE statement.
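The precedence can be sketched in a few lines of Python. This only models the
rule described above; the effective_encoded_bytes function, the dictionaries,
and the values 2 and 1 are hypothetical and are not taken from Phoenix.

```python
GLOBAL_CONFIG_KEY = "phoenix.default.column.encoded.bytes.attrib"


def effective_encoded_bytes(site_config, table_properties):
    """Per-table COLUMN_ENCODED_BYTES wins; otherwise use the global config."""
    if "COLUMN_ENCODED_BYTES" in table_properties:
        return table_properties["COLUMN_ENCODED_BYTES"]
    return site_config[GLOBAL_CONFIG_KEY]


# Hypothetical values, chosen only for the example.
site_config = {GLOBAL_CONFIG_KEY: 2}

plain_table = effective_encoded_bytes(site_config, {})
custom_table = effective_encoded_bytes(site_config, {"COLUMN_ENCODED_BYTES": 1})

print(plain_table, custom_table)  # 2 1
```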
bq. Does it make sense to give a sentence declaring what "single cell array
with offsets" and "one cell per column" mean and how they differ? Is there any
effect on sharding for one versus the other?
Sure, we can add that. Single cell array with offsets packs all Phoenix columns
belonging to a column family into a single HBase cell. One cell per column
encoding means that there is one HBase cell per Phoenix column. There is no
effect on sharding, since that is determined by the row key, whereas this
encoding applies only to the key value columns.
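A small Python sketch may help show the difference between the two schemes. The
byte layout here (values, then 4-byte offsets, then a 4-byte count) and the
function names are my own assumptions for illustration and are not Phoenix's
actual serialization format.

```python
import struct


def one_cell_per_column(row_key, family, values):
    """One cell per column: a (row, family, qualifier) key for every value."""
    return {(row_key, family, q): v for q, v in values.items()}


def single_cell_array(row_key, family, values, qualifier=b"\x00"):
    """Pack every value of the family into one cell.

    Hypothetical layout: concatenated values, then one 4-byte offset per
    value, then a 4-byte value count.
    """
    ordered = [values[q] for q in sorted(values)]
    offsets, pos = [], 0
    for v in ordered:
        offsets.append(pos)
        pos += len(v)
    trailer = struct.pack(">%dI" % len(offsets), *offsets)
    trailer += struct.pack(">I", len(offsets))
    return {(row_key, family, qualifier): b"".join(ordered) + trailer}


def read_from_array(cell, index):
    """Use the offsets to slice one value out of the packed cell."""
    (count,) = struct.unpack(">I", cell[-4:])
    offsets_start = len(cell) - 4 - 4 * count
    offsets = struct.unpack(">%dI" % count, cell[offsets_start:-4])
    end = offsets[index + 1] if index + 1 < count else offsets_start
    return cell[offsets[index]:end]


values = {b"\x00\x01": b"alice", b"\x00\x02": b"42", b"\x00\x03": b"x"}

separate = one_cell_per_column(b"row1", b"0", values)
packed = single_cell_array(b"row1", b"0", values)
packed_cell = next(iter(packed.values()))

print(len(separate), len(packed))       # 3 1
print(read_from_array(packed_cell, 1))  # b'42'
```

Both forms live under the same row key, which is why the choice between them
does not change how rows are distributed.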
> Provide user-level documentation for column encoding feature
> ------------------------------------------------------------
>
> Key: PHOENIX-3718
> URL: https://issues.apache.org/jira/browse/PHOENIX-3718
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Assignee: Samarth Jain
> Attachments: columnencoding.md
>
>
> Let's create a new page on our site describing column encoding. I think we
> should keep it at a very high level, but document when the feature should and
> should not be used.
> A blog with more technical depth on the implementation might be interesting,
> though. You up for that [~samarthjain]?