Hi Ted,
Thanks for pointing out that HBASE-3488 is about CellCounter.
That will give better visibility into the stored cells.
Vishal's initial question was about having numerous versions for the same
row ID/key.
I know data model design depends on the use case, but from a technical
point of view (read/write performance...) is there anything against
having numerous (thousands, millions...) versions of the same key?
Thanks,
- Eric
On 4/04/2011 18:38, Ted Yu wrote:
For 2, HBASE-3488 is for CellCounter.
In Vishal's case, 3 years of data is stored for a given row key. Issuing a
'get' command would not help much.
TIMERANGE support has been added in HBASE-3729.
Cheers
On Sun, Apr 3, 2011 at 11:40 PM, Eric Charles <[email protected]> wrote:
1.- On my side, I could imagine using the versions to store the history of
a key (without the need to add an extra index table). It really depends on
the requirements and the data model, I think, but many versions can
sometimes make sense.
2.- HBASE-3488 is related to the Hadoop rowcounter job. To get versions by
code, you can use the setTimeStamp/setMaxVersions/setTimeRange methods of the
Get and Scan objects. Via the shell, you can use "get 't1', 'r1', {COLUMN
=> 'c1', TIMESTAMP => ts1, VERSIONS => 4}" (not sure if it's possible with
TIMERANGE via the shell?)
Thanks,
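To make the effect of a version limit combined with a time range more concrete, here is a rough toy model in plain Java. It is not the HBase client API: the class VersionedCell and its put/get methods are made up for illustration only. It assumes what Get.setMaxVersions and Get.setTimeRange are documented to do, namely return the newest versions first and treat the range as minStamp inclusive, maxStamp exclusive.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the versions of a single cell. Hypothetical class, NOT HBase code.
class VersionedCell {
    // Versions keyed by timestamp, sorted newest first.
    private final TreeMap<Long, String> versions =
            new TreeMap<>(Collections.reverseOrder());

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // Returns at most maxVersions values, newest first,
    // whose timestamp satisfies minStamp <= ts < maxStamp.
    List<String> get(long minStamp, long maxStamp, int maxVersions) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, String> e : versions.entrySet()) {
            long ts = e.getKey();
            if (ts >= maxStamp) {
                continue; // newer than the requested range
            }
            if (ts < minStamp || out.size() == maxVersions) {
                break; // remaining entries are older, or the limit is reached
            }
            out.add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell();
        for (long ts = 1; ts <= 10; ts++) {
            cell.put(ts, "v" + ts); // one version per timestamp 1..10
        }
        // Up to 4 versions with 3 <= ts < 8
        System.out.println(cell.get(3, 8, 4)); // prints [v7, v6, v5, v4]
    }
}
```

With a real table the same selection would be expressed on a Get or Scan object rather than on an in-memory map; the model only shows which versions come back.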
- Eric
On 3/04/2011 22:12, Ted Yu wrote:
For 1, please give some background to justify the high number of versions.
For 2, take a look at HBASE-3488.
On Sun, Apr 3, 2011 at 12:49 PM, Vishal Kapoor
<[email protected]> wrote:
Two questions:
1) If I set the number of versions for a family to 365*3, is it a bad
design? How many versions are good practice? If I have too many
versions, will that still be a single seek when I get the row ID? If yes,
will it take longer to store data? Pros and cons?
2) How do I get the number of versions actually stored in a cell (not
the max versions it is configured to store)?
Thanks,
Vishal