Modeling Question: Repeating Groups and Time Series

Tim Stearn Thu, 24 Apr 2014 19:17:32 -0700

Hi All,

VERY new to Phoenix, so please forgive a newbie question ;)


I just watched this introductory video:  
http://www.youtube.com/watch?v=YHsHdQ08trg.  From this, I understand that there 
is a 1:1 correspondence between Phoenix and HBase tables.  That being the case, 
I assume that the same modeling imperatives for HBase apply for HBase+Phoenix.  
Namely that:
1.       Good key design is very important

2.       You should try to denormalize when reasonable to avoid scans of 
multiple tables

For #2, this implies that rows will likely have repeating groups and time 
series in columns, using naming conventions to associate related columns.  For 
instance, our application will need to support storing networks of related 
entities.  We're planning on using the same adjacency list approach that 
TitanDB uses, so that each entity would include a column family for 
relationships, with related fields using a naming convention like:
     [column name, column value (relationship strength)]:
     <relationship type>_<related entity id>:  100
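
To make the convention concrete, here is a minimal sketch of how the 
application side could split such a column name back into its parts.  This is 
plain Python and not a Phoenix feature; the names Relationship and 
parse_relationship, the sample row, and the assumption that entity ids never 
contain an underscore are all hypothetical.

```python
from typing import NamedTuple


class Relationship(NamedTuple):
    rel_type: str      # e.g. "works_with"
    related_id: str    # id of the related entity
    strength: int      # the column value


def parse_relationship(column_name: str, value: int) -> Relationship:
    """Split '<relationship type>_<related entity id>' into its parts.

    Assumption: the entity id contains no underscore, so we split on the
    LAST underscore and let the relationship type keep any earlier ones.
    """
    rel_type, sep, related_id = column_name.rpartition("_")
    if not sep or not rel_type or not related_id:
        raise ValueError(f"not a relationship column: {column_name!r}")
    return Relationship(rel_type, related_id, value)


# Hypothetical row from the relationships column family.
row = {"owns_42": 100, "works_with_17": 35}
edges = [parse_relationship(name, value) for name, value in row.items()]
```

Splitting on the last underscore is only one possible choice; a separator that 
cannot appear in either part would avoid the ambiguity altogether.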

We will probably also have time series within a row.  So I might have a set of 
related metrics, keyed by the time period and metric name:
    <Year+Month>_<metric name>

     201401_average_balance, 201402_average_balance,
     201401_max_balance, 201402_max_balance, etc.
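
As an illustration of what the application would have to do with such a row, 
the following sketch regroups the columns into one series per metric.  It is 
only a sketch under my own assumptions: the function name group_by_metric and 
the sample values are made up, and it assumes the period prefix is always a 
six-digit YYYYMM.

```python
import re
from collections import defaultdict

# '<Year+Month>_<metric name>', e.g. '201401_average_balance'
_COLUMN = re.compile(r"^(\d{4})(0[1-9]|1[0-2])_(.+)$")


def group_by_metric(row):
    """Return {metric name: {(year, month): value}} for one row.

    Columns that do not follow the naming convention are skipped.
    """
    series = defaultdict(dict)
    for name, value in row.items():
        match = _COLUMN.match(name)
        if match is None:
            continue
        year, month, metric = match.groups()
        series[metric][(int(year), int(month))] = value
    return dict(series)


# Hypothetical row; the values are invented for the example.
row = {
    "201401_average_balance": 1200,
    "201402_average_balance": 1350,
    "201401_max_balance": 2000,
    "201402_max_balance": 2100,
    "customer_name": "n/a",  # ignored: does not match the convention
}
series = group_by_metric(row)
```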


My questions are these:

1.       Is this denormalization approach still the recommended approach when 
using HBase with Phoenix?

2.       Is there any special support in Phoenix to extract the "data" 
contained in column names (like the <related entity id>) above, or is this up 
to the application processing the query result?

3.       Is there any special support for querying these repeating groups of 
fields (such that Phoenix could be aware of the relationships between these 
fields) or is it:

a.       Up to the user to know which fields they're after and specify the 
proper names

b.      Up to the application to internally keep track of these fields and 
their relationships and generate the proper queries on the user's behalf?
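
For option (b), this is roughly what I imagine the application doing: expand 
a metric name and a list of periods into column names and build the query 
text from them.  It is a sketch only; metric_columns and build_select are 
hypothetical helpers, the table and key names are invented, and I am assuming 
the columns must be double-quoted because they start with a digit.

```python
def metric_columns(metric, periods):
    """Expand one metric into '<Year+Month>_<metric name>' column names."""
    return [f"{period}_{metric}" for period in periods]


def _quote(identifier):
    # Refuse embedded quotes rather than try to escape them.
    if '"' in identifier:
        raise ValueError(f"invalid identifier: {identifier!r}")
    return f'"{identifier}"'


def build_select(table, key_column, columns):
    """Build a parameterized SELECT over explicitly named columns."""
    column_list = ", ".join(_quote(c) for c in columns)
    return (
        f"SELECT {column_list} FROM {_quote(table)} "
        f"WHERE {_quote(key_column)} = ?"
    )


# Hypothetical table "ACCOUNTS" keyed by "ID".
columns = metric_columns("average_balance", ["201401", "201402"])
query = build_select("ACCOUNTS", "ID", columns)
# query == 'SELECT "201401_average_balance", "201402_average_balance" '
#          'FROM "ACCOUNTS" WHERE "ID" = ?'
```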

Thanks in advance,

Tim S
