[ https://issues.apache.org/jira/browse/PHOENIX-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904271#comment-16904271 ]
Andrew Purtell commented on PHOENIX-5435:
-----------------------------------------
bq. It should have a feature toggle so operators who don't need it don't bear
the slight extra storage cost.
Not arguing with the idea. Slight is probably the right characterization; it
will be interesting to see how that bears out. I've found the HBase WAL is
already really inefficient as a storage format (that's not its intent), but
that can be mitigated by turning on WAL compression. For example, if I
Avro-encode a 100-column row (where the data in cells is 8-byte longs, randomly
set), the per-record encoding is 100x more efficient in Avro form than in the
WAL. That gap would narrow considerably with WAL compression enabled, although
I did not measure it.
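
A rough sketch of where the per-cell overhead comes from, using the standard
HBase KeyValue layout (4-byte key length, 4-byte value length, 2-byte row
length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte
type, value). The row key, family, and qualifier lengths below are hypothetical
and are not taken from the measurement described above. WAL entry framing and
cell tags are not modeled, so this does not attempt to reproduce the 100x
figure; it only shows that every cell repeats the row key and column
coordinates.

```python
def keyvalue_size(row_len: int, family_len: int, qualifier_len: int,
                  value_len: int) -> int:
    """Serialized size in bytes of one HBase KeyValue without tags."""
    # Key part: row length (2) + row + family length (1) + family
    #           + qualifier + timestamp (8) + type (1)
    key_len = 2 + row_len + 1 + family_len + qualifier_len + 8 + 1
    # Two 4-byte length prefixes (key length, value length) precede the key.
    return 4 + 4 + key_len + value_len


# Hypothetical schema, chosen only for illustration.
ROW_LEN = 16        # bytes in the row key
FAMILY_LEN = 1      # bytes in the column family name
QUALIFIER_LEN = 4   # bytes in each column qualifier
VALUE_LEN = 8       # one 8-byte long per cell
COLUMNS = 100

per_cell = keyvalue_size(ROW_LEN, FAMILY_LEN, QUALIFIER_LEN, VALUE_LEN)
cells_total = per_cell * COLUMNS     # bytes for all cells of one row
payload_total = VALUE_LEN * COLUMNS  # bytes of actual long values

print(f"per cell: {per_cell} B, row: {cells_total} B, "
      f"payload only: {payload_total} B")
```

Longer row keys or qualifiers increase the per-cell cost linearly, since they
are repeated in every cell of the row.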
> Annotate HBase WALs with Phoenix Metadata
> -----------------------------------------
>
> Key: PHOENIX-5435
> URL: https://issues.apache.org/jira/browse/PHOENIX-5435
> Project: Phoenix
> Issue Type: New Feature
> Reporter: Geoffrey Jacoby
> Assignee: Geoffrey Jacoby
> Priority: Major
>
> HBase write-ahead-logs (WALs) drive not only failure recovery, but HBase
> replication and some HBase backup frameworks. The WALs contain HBase-level
> metadata such as table and region, but lack Phoenix-level metadata. That
> means that it's quite difficult to build correct logic that needs to know
> about Phoenix-level constructs such as multi-tenancy, views, or indexes.
> HBASE-22622 and HBASE-22623 add the capacity for coprocessors to annotate
> extra key/value pairs of metadata into the HBase WAL. We should have the
> option to annotate the WAL with the tuple <tenant_id, table-or-view-name,
> timestamp>, or with some hashed form from which that tuple can be
> reconstructed. It should have a
> feature toggle so operators who don't need it don't bear the slight extra
> storage cost.
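
A minimal sketch of the proposal above, not the Phoenix implementation. It
models the WAL annotation as a plain dictionary of key/value pairs and shows
the feature toggle, the plain tuple form, and one possible hashed form. The
attribute key names, the function names, the in-memory registry, and the
8-byte truncated SHA-256 digest are all assumptions made for illustration; the
actual HBase coprocessor hooks from HBASE-22622 and HBASE-22623 are not used
here.

```python
import hashlib
from typing import Dict, Optional, Tuple

# Hypothetical attribute keys, chosen for illustration only.
TENANT_KEY = "phoenix.tenant_id"
NAME_KEY = "phoenix.table_or_view"
TS_KEY = "phoenix.timestamp"
HASH_KEY = "phoenix.metadata_hash"

# Hypothetical lookup table for the hashed form: digest -> (tenant_id, name).
# A hash alone cannot be reversed, so some mapping like this is needed.
_registry: Dict[bytes, Tuple[str, str]] = {}


def build_annotations(tenant_id: Optional[str], name: str, timestamp: int,
                      enabled: bool = True,
                      hashed: bool = False) -> Dict[str, bytes]:
    """Return the extra key/value pairs to attach to one WAL entry."""
    if not enabled:
        # Feature toggle off: no extra bytes are written.
        return {}
    tenant = tenant_id or ""  # empty string stands in for "no tenant"
    ts_bytes = timestamp.to_bytes(8, "big")
    if hashed:
        # Fixed-size digest in place of the variable-length names.
        digest = hashlib.sha256(
            f"{tenant}\0{name}".encode("utf-8")).digest()[:8]
        _registry[digest] = (tenant, name)
        return {HASH_KEY: digest, TS_KEY: ts_bytes}
    return {
        TENANT_KEY: tenant.encode("utf-8"),
        NAME_KEY: name.encode("utf-8"),
        TS_KEY: ts_bytes,
    }


def resolve(annotations: Dict[str, bytes]) -> Tuple[str, str, int]:
    """Reconstruct (tenant_id, name, timestamp) from either form."""
    ts = int.from_bytes(annotations[TS_KEY], "big")
    if HASH_KEY in annotations:
        tenant, name = _registry[annotations[HASH_KEY]]
    else:
        tenant = annotations[TENANT_KEY].decode("utf-8")
        name = annotations[NAME_KEY].decode("utf-8")
    return tenant, name, ts
```

For example, `resolve(build_annotations("t1", "MY_VIEW", 1565395200000,
hashed=True))` gives back the original tuple, while passing `enabled=False`
yields an empty dictionary and therefore no extra storage.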