Re: Modeling column families

Andrew Nguyen Fri, 04 Jun 2010 09:20:12 -0700

Ryan,

I went ahead and began modeling our data as you suggested below.  However, 
we just realized there is a problem with our compound key.  We don't actually 
have access to the patient identifier at the level where the data collection 
is performed.  What we do know is the bed #.  We have a predetermined number 
of beds, so I was wondering whether there is a better way to model everything 
given this finite (and predetermined) set of values for the compound key.

Given this, would it be better to have a different table for each bed (and just 
have the row key be the time stamp)?  What are the downsides to having hundreds 
of different tables that have the same "schema" otherwise?
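
For comparison with the table-per-bed idea, here is a minimal sketch of the 
single-table alternative, where the bed # simply takes the place of the 
patient id in the compound key: <bed #><timestamp>.  It relies on the fact 
that HBase keeps rows sorted by the raw bytes of the row key, so all rows for 
one bed sit next to each other and can be read with a start/stop row range.  
The fixed-width layout (2-byte bed number, 8-byte millisecond timestamp) and 
the function names are assumptions made up for illustration, and the sketch 
only manipulates byte strings; it does not talk to HBase.

```python
import struct
from typing import Tuple

# Hypothetical layout: 2-byte bed number + 8-byte epoch milliseconds.
# Big-endian unsigned fields make byte-wise order match numeric order.
KEY_FORMAT = ">HQ"


def make_row_key(bed: int, ts_ms: int) -> bytes:
    """Build the compound row key <bed #><timestamp> as 10 raw bytes."""
    return struct.pack(KEY_FORMAT, bed, ts_ms)


def parse_row_key(key: bytes) -> Tuple[int, int]:
    """Split a row key back into (bed, timestamp in ms)."""
    bed, ts_ms = struct.unpack(KEY_FORMAT, key)
    return bed, ts_ms


def bed_scan_range(bed: int) -> Tuple[bytes, bytes]:
    """Start key (inclusive) and stop key (exclusive) covering one bed.

    Assumes bed < 65535 so that bed + 1 still fits in two bytes.
    """
    return struct.pack(">H", bed), struct.pack(">H", bed + 1)


# Simulate the sorted order in which a table would hold these rows.
keys = sorted(
    make_row_key(bed, ts)
    for bed, ts in [(12, 2000), (3, 5000), (12, 1000), (3, 4000)]
)

# Selecting one bed is a contiguous range over the sorted keys.
start, stop = bed_scan_range(12)
bed_12 = [parse_row_key(k) for k in keys if start <= k < stop]
print(bed_12)
```

With this layout the bed number acts like a partition inside one table, and 
adding a bed needs no schema change.  Whether that is preferable to hundreds 
of tables is exactly the open question above; the sketch does not settle it.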

Thanks!

--Andrew

On Apr 24, 2010, at 1:45 PM, Ryan Rawson wrote:

> Hey,
> 
> So in my case, the timestamp wasn't unique, so I had to put in an event id.
> For timeseries systems, you of course wouldn't need to have an
> additional id.  So your first thought where you have:
> <patient id><timestamp>
> 
> then putting physiologic parameters in different columns (but the same
> column family) sounds great to me.  This is a good example of where
> flexible schema is good, since you can store any number of parameters
> per row, but only the ones you want.
> 
> As for HBase and multi-datacenter, there is work underway by my
> colleague JD to write a replication system.  It's in the late stages
> and we are hoping to get it into advanced testing soon.  Practically
> speaking, you don't want to split your HDFS and HBase cluster across
> datacenters.
