Good points. A word combination is what I was trying to describe. Say I have a word (lemma) and need the words before and after it to be queryable by lemma; let's call this a sentence. My rowkey would then essentially be a sentence (with the doc id). But I can still end up with identical rowkeys within a document, since the same sentence can occur more than once. Maybe I'm missing something... hmm...
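To make the duplicate-key concern concrete, here is a rough sketch of one possible key layout along the lines of the document-id / word-position / word combination suggested in the quoted message. It is plain Python with no HBase client involved: a sorted list of strings stands in for HBase's sorted row keys, and every name in it (`make_row_key`, `index_document`, `prefix_scan`, the `\x00` separator, the 10-digit padding, the window of 2 words) is a made-up assumption for illustration, not anything from HBase itself. The lemma is put first here only because the goal is lookup by lemma; putting the doc id first would instead keep a document's words next to each other.

```python
from bisect import bisect_left

SEP = "\x00"  # assumed separator; sorts before any printable character


def make_row_key(lemma: str, doc_id: str, position: int) -> str:
    """Compose lemma, document id, and zero-padded word position into one key."""
    # Zero-padding keeps positions in numeric order under lexicographic sorting.
    return SEP.join([lemma, doc_id, f"{position:010d}"])


def index_document(doc_id, lemmas, window=2):
    """Return {row_key: (words_before, words_after)} for each word occurrence."""
    rows = {}
    for pos, lemma in enumerate(lemmas):
        before = lemmas[max(0, pos - window):pos]
        after = lemmas[pos + 1:pos + 1 + window]
        # The position makes the key unique even when a word repeats in a doc.
        rows[make_row_key(lemma, doc_id, pos)] = (before, after)
    return rows


def prefix_scan(sorted_keys, prefix):
    """Mimic a range scan over sorted keys: yield keys that start with prefix."""
    start = bisect_left(sorted_keys, prefix)
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        yield key


# Hypothetical sample data: "the" appears twice in doc1 and once in doc2.
table = index_document("doc1", ["the", "cat", "saw", "the", "dog"])
table.update(index_document("doc2", ["the", "bird"]))
keys = sorted(table)
hits = list(prefix_scan(keys, "the" + SEP))
```

With this layout the two occurrences of "the" in doc1 get different keys because their positions differ, and scanning the prefix `"the" + SEP` returns every occurrence of that lemma across documents without also matching longer words that merely begin with "the".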
Travis Hegner wrote:
>
> What about a document-id, word-position, and word combination. With the
> proper combo all words in a single document would be located near
> each other.
>
> Travis Hegner
> http://www.travishegner.com/
>
>
> -----Original Message-----
> From: llpind [mailto:[email protected]]
> Sent: Monday, August 24, 2009 1:37 PM
> To: [email protected]
> Subject: HBase data model question
>
>
> Hey,
>
> I'm trying to move a relational model to HBase, and would like some input.
>
> Suppose I have a constant stream of documents coming in, and I'd like to
> parse these by a single word.
>
> It makes sense to have this word as my rowkey, but I need a way to handle
> duplicate word text. Kind of a dictionary.
>
> What is the best way to solve this in HBase? Timestamp in row key? Since I
> need a way to identify each word uniquely.
>
>
> Thanks.
> --
> View this message in context:
> http://www.nabble.com/HBase-data-model-question-tp25120285p25120285.html
> Sent from the HBase User mailing list archive at Nabble.com.
>

--
View this message in context: http://www.nabble.com/HBase-data-model-question-tp25120285p25121358.html
