virajjasani opened a new pull request, #2225:
URL: https://github.com/apache/phoenix/pull/2225

   Jira: PHOENIX-7667
   
   Types of TTLs supported by Phoenix:
   
   1. Literal TTL: A simple numeric value specifying the TTL in seconds.
   e.g.
   `CREATE TABLE T1 (id VARCHAR PRIMARY KEY, COL1 VARCHAR, COL2 INTEGER) TTL = 86400`
   
   Literal TTL values:
   
   - NONE / TTL_EXPRESSION_NOT_DEFINED: TTL is not specified for the table.
   - FOREVER / TTL_EXPRESSION_FOREVER: Rows never expire.
   - TTL_EXPRESSION_DEFINED_IN_TABLE_DESCRIPTOR: Clients older than Phoenix 5.3 
set the TTL value on the HBase TableDescriptor.
   - User-provided TTL: Literal value of TTL in seconds.
   
   
   2. Conditional TTL: A boolean expression that determines row expiration 
based on column values.
   e.g.
   `CREATE TABLE T1 (id VARCHAR PRIMARY KEY, status VARCHAR) TTL = 'status = ''EXPIRED'' OR TO_NUMBER(CURRENT_TIME()) - TO_NUMBER(PHOENIX_ROW_TIMESTAMP()) >= 108000000'`
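
   To make the example expression concrete, here is a minimal Python sketch of 
the same predicate evaluated against a single row. It assumes that TO_NUMBER on 
a time or timestamp value yields milliseconds since the epoch, in which case the 
threshold 108000000 corresponds to 30 hours. The function and parameter names 
are hypothetical and are not part of Phoenix.

```python
# Hypothetical model of the example conditional TTL expression; not Phoenix code.
HOUR_MS = 60 * 60 * 1000
TTL_THRESHOLD_MS = 108_000_000  # 30 hours, assuming millisecond timestamps


def is_expired(status, row_timestamp_ms, current_time_ms):
    """Mirror of: status = 'EXPIRED' OR (current time - row timestamp) >= 108000000."""
    age_ms = current_time_ms - row_timestamp_ms
    return status == "EXPIRED" or age_ms >= TTL_THRESHOLD_MS


now_ms = 1_700_000_000_000  # arbitrary fixed "current time" for the example

print(is_expired("ACTIVE", now_ms - 29 * HOUR_MS, now_ms))   # False
print(is_expired("ACTIVE", now_ms - 30 * HOUR_MS, now_ms))   # True
print(is_expired("EXPIRED", now_ms, now_ms))                 # True
```

   Either branch of the OR is enough to expire the row: an explicit status of 
EXPIRED, or an age of at least 30 hours.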
   
   As of Phoenix 5.3.0 (client and server), both types of TTLs are stored in 
the SYSTEM.CATALOG table.
   
   By default, conditional TTL involves parsing and compiling the TTL 
expression, serializing the compiled expression into bytes, and sending the 
serialized bytes as a Scan attribute. The Scan attribute for Conditional TTL is 
deserialized and used by many region coprocessors and scanner implementations, 
including TTLRegionScanner, GlobalIndexRegionScanner and IndexRegionObserver.
   
   To provide a strict TTL view for users, the region observers perform extra 
computation on each row to determine whether it has already expired and 
therefore should not be processed. For instance, IndexRegionObserver performs a 
read for the given upsert to identify whether the row has already expired. 
While the extra cost incurred by the region observers helps provide strict TTL 
expiration, some use cases might not require strict TTL expiry.
   
   Let’s define the types of TTL use cases:
   
   - Strict TTL expiration: As soon as the TTL expires for a given row, the 
row must not be visible to user queries.
   - Relaxed TTL expiration: After the TTL expires, the row can remain visible 
to user queries for several hours to days.
   
   Costs involved in achieving strict Conditional TTL expiry:
   
   - TTLRegionScanner masks expired rows (Literal and Conditional TTL)
   - IndexRegionObserver performs a read for each update operation (no 
additional cost when the table has a covered index)
   - GlobalIndexRegionScanner evaluates the TTL expression for each data table 
row during rebuild
   
   For users who do not need immediate row masking after TTL expiry, we can 
provide an optional configuration to avoid the extra cost associated with 
strict TTL expiration. Relaxed TTL expiration is expected to rely only on major 
compaction. After a major compaction successfully expires or deletes the given 
row, the client will no longer be able to read the expired row.
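
   The difference between the two modes can be illustrated with a small 
in-memory model. This is only a sketch under simplifying assumptions: the `Row` 
and `Table` classes, the `strict_ttl` flag and the `major_compact` method are 
hypothetical names, not Phoenix or HBase APIs, and each row carries a 
precomputed expiry time instead of a TTL expression.

```python
# Hypothetical model of strict vs. relaxed TTL visibility; not Phoenix code.
from dataclasses import dataclass


@dataclass
class Row:
    key: str
    expires_at_ms: int  # time at which the row's TTL expires


class Table:
    def __init__(self, strict_ttl):
        self.strict_ttl = strict_ttl
        self.rows = {}

    def upsert(self, row):
        self.rows[row.key] = row

    def scan(self, now_ms):
        """Return the keys visible to a query issued at now_ms."""
        if self.strict_ttl:
            # Strict: expired rows are masked at read time (extra per-row check).
            return sorted(k for k, r in self.rows.items() if r.expires_at_ms > now_ms)
        # Relaxed: no read-time check; rows stay visible until compaction drops them.
        return sorted(self.rows)

    def major_compact(self, now_ms):
        """Physically drop expired rows, as major compaction is expected to do."""
        self.rows = {k: r for k, r in self.rows.items() if r.expires_at_ms > now_ms}


strict, relaxed = Table(strict_ttl=True), Table(strict_ttl=False)
for table in (strict, relaxed):
    table.upsert(Row("a", expires_at_ms=100))
    table.upsert(Row("b", expires_at_ms=500))

print(strict.scan(now_ms=200))    # ['b']
print(relaxed.scan(now_ms=200))   # ['a', 'b']
relaxed.major_compact(now_ms=200)
print(relaxed.scan(now_ms=200))   # ['b']
```

   In the relaxed table the expired row "a" stays readable until the compaction 
step runs, which is the trade-off accepted in exchange for skipping the 
read-time checks.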

