[
https://issues.apache.org/jira/browse/PHOENIX-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448519#comment-16448519
]
James Taylor edited comment on PHOENIX-4701 at 4/23/18 5:38 PM:
----------------------------------------------------------------
Also, I don't think a PK of only QUERY_ID is particularly useful. It should be
part of the PK (for uniqueness), but at the end, as we won't ever query by
QUERY_ID. What would be the most common query against the table? That'll drive
what the PK should be. Perhaps a PK of (START_TIME, TOTAL_EXECUTION_TIME,
QUERY_ID). We'd want to salt the table as well to prevent write hotspotting,
since a leading START_TIME makes the row keys monotonically increasing.
This PK would let us efficiently find slow queries that occurred within a
given time range. For example:
{code:java}
SELECT * FROM SYSTEM.LOG WHERE START_TIME > CURRENT_DATE()-1.0/24.0 AND
START_TIME < CURRENT_DATE() AND TOTAL_EXECUTION_TIME > 1000;{code}
Without a better PK, the above would be a full table scan.
Also, START_TIME should be declared as a DATE, not a TIMESTAMP. TIMESTAMP is
nanosecond granularity which we don't need (and wouldn't capture anyway) and
it'd cause more overhead. DATE is millisecond granularity which is what we'd
want.
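Putting the suggestions together, the DDL might look roughly like the sketch below. This is only an illustration, not the actual SYSTEM.LOG definition: the table name QUERY_LOG_SKETCH, the QUERY column, the BIGINT/VARCHAR types, and the bucket count of 16 are all assumptions made for the example. The table properties are the ones named in the issue description.
{code:sql}
-- Hypothetical sketch only; not the real SYSTEM.LOG schema.
CREATE TABLE IF NOT EXISTS QUERY_LOG_SKETCH (
    START_TIME DATE NOT NULL,              -- DATE: millisecond granularity
    TOTAL_EXECUTION_TIME BIGINT NOT NULL,  -- assumed type
    QUERY_ID VARCHAR NOT NULL,             -- last in the PK, for uniqueness only
    QUERY VARCHAR,                         -- assumed non-PK column
    CONSTRAINT PK PRIMARY KEY (START_TIME, TOTAL_EXECUTION_TIME, QUERY_ID)
)
SALT_BUCKETS = 16,                         -- assumed count; spreads time-ordered writes
IMMUTABLE_ROWS = true,
COLUMN_ENCODED_BYTES = 1,
IMMUTABLE_STORAGE_SCHEME = SINGLE_CELL_ARRAY_WITH_OFFSETS;{code}
With START_TIME leading the PK, the time-range predicate in the example query can be turned into a range scan within each salt bucket instead of a full table scan.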
was (Author: jamestaylor):
Also, I don't think a PK of only QUERY_ID is particularly useful. It should be
part of the PK (for uniqueness), but at the end, as we won't ever query by
QUERY_ID. What would be the most common query against the table? That'll drive
what the PK should be. Perhaps a PK of (TOTAL_EXECUTION_TIME, START_TIME,
QUERY_ID).
Also, START_TIME should be declared as a DATE, not a TIMESTAMP. TIMESTAMP is
nanosecond granularity which we don't need (and wouldn't capture anyway) and
it'd cause more overhead. DATE is millisecond granularity which is what we'd
want.
> Declare SYSTEM.LOG table as immutable with compact storage format
> -----------------------------------------------------------------
>
> Key: PHOENIX-4701
> URL: https://issues.apache.org/jira/browse/PHOENIX-4701
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Assignee: James Taylor
> Priority: Major
> Fix For: 4.14.0, 5.0.0
>
>
> If possible, the SYSTEM.LOG table would benefit greatly (3-5x perf gain)
> from being declared as immutable with a column encoding of 1 byte and a
> storage format of SINGLE_CELL_ARRAY_WITH_OFFSETS.