What about a combination of document ID, word position, and word? With the
right ordering (document ID first, for example), all words in a single
document would be located near each other, since HBase stores rows sorted by
row key.
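
To make the idea concrete, here is a minimal sketch of such a composite key in
plain Python. It does not talk to HBase at all; it only shows how the key bytes
could be laid out so that byte-wise sorting groups rows by document and then by
position. The function names, the field widths (8 bytes for the document ID,
4 for the position), and the sample data are assumptions made for illustration.

```python
import struct

def make_row_key(doc_id: int, position: int, word: str) -> bytes:
    """Build a composite key: document ID, then word position, then the word.

    Fixed-width big-endian unsigned integers compare the same way as bytes
    as they do as numbers, so keys sort by document first, then by position.
    The widths (8 and 4 bytes) are an assumption for this sketch.
    """
    return struct.pack(">QI", doc_id, position) + word.encode("utf-8")

def parse_row_key(key: bytes):
    """Split a composite key back into (doc_id, position, word)."""
    doc_id, position = struct.unpack(">QI", key[:12])
    return doc_id, position, key[12:].decode("utf-8")

# Hypothetical sample data, deliberately inserted out of order.
keys = [
    make_row_key(2, 0, "hbase"),
    make_row_key(1, 1, "model"),
    make_row_key(1, 0, "data"),
    make_row_key(1, 2, "data"),
]

# Sorting the raw bytes mimics the order in which a sorted key-value store
# would hold the rows: all of document 1, in position order, then document 2.
for key in sorted(keys):
    print(parse_row_key(key))
```

Note that the two occurrences of "data" in document 1 get distinct keys because
their positions differ, so duplicate word text is not a problem with this
layout.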

Travis Hegner
http://www.travishegner.com/


-----Original Message-----
From: llpind [mailto:[email protected]]
Sent: Monday, August 24, 2009 1:37 PM
To: [email protected]
Subject: HBase data model question


Hey,

I'm trying to move a relational model to HBase, and would like some input.

Suppose I have a constant stream of documents coming in, and I'd like to parse
these into single words.

It makes sense to have this word as my row key, but I need a way to handle
duplicate word text.  Kind of a dictionary.

What is the best way to solve this in HBase?  A timestamp in the row key?  I
need a way to identify each word uniquely.
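
One possible way to keep the word at the front of the row key while still
making every occurrence unique is to append a document ID and word position to
it. The sketch below is plain Python over a sorted list, not HBase client code;
the separator byte, the field widths, and the names `word_row_key` and
`scan_word` are assumptions made for illustration.

```python
import struct
from bisect import bisect_left

SEP = b"\x00"  # assumed separator; words must not contain a NUL byte

def word_row_key(word: str, doc_id: int, position: int) -> bytes:
    """Word first, then a fixed-width document ID and position as a suffix."""
    return word.encode("utf-8") + SEP + struct.pack(">QI", doc_id, position)

def scan_word(sorted_keys, word: str):
    """Return (doc_id, position) for every occurrence of `word`.

    This imitates a prefix scan over keys held in sorted order.
    """
    prefix = word.encode("utf-8") + SEP
    start = bisect_left(sorted_keys, prefix)
    hits = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so the matching run has ended
        hits.append(struct.unpack(">QI", key[len(prefix):]))
    return hits

# Hypothetical sample data.
table = sorted([
    word_row_key("data", 1, 0),
    word_row_key("database", 3, 4),
    word_row_key("data", 1, 2),
    word_row_key("data", 2, 7),
    word_row_key("model", 1, 1),
])

print(scan_word(table, "data"))
```

The separator keeps a scan for "data" from also matching "database", and the
suffix gives each occurrence its own key without relying on a timestamp.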


Thanks.
--
View this message in context: 
http://www.nabble.com/HBase-data-model-question-tp25120285p25120285.html
Sent from the HBase User mailing list archive at Nabble.com.

