Actually, the discussion started from this post:
http://search-hadoop.com/m/XX3nW68JsY1/hbase+insertion+optimisation&subj=hbase+insertion+optimisation+

When simply inserting data whose row key is <date>_<somedata>, I noticed that only one node does any work (the one hosting the region the data is being written to). With 10-15 nodes, I think it is inefficient to write data to only one region. I want the data to be inserted into as many nodes as possible simultaneously. Correct me, guys, but in that case the write job will take less time, am I right?

Oleg.

On Sun, Mar 20, 2011 at 8:57 PM, Chris Tarnas <[email protected]> wrote:

> There is none - HBase uses a total order partitioner. The straight key
> value itself determines which region a row is put into. This allows for very
> rapid scans of sequential data, among other things, but does mean it is
> easier to hotspot regions. Key design is very important.
>
> -chris
>
> On Mar 20, 2011, at 11:41 AM, Lior Schachter wrote:
>
> > the hash function that distributes the rows between the regions.
> >
> > On Sun, Mar 20, 2011 at 8:36 PM, Stack <[email protected]> wrote:
> >
> >> Hash? Which hash are you referring to sir?
> >> St.Ack
> >>
> >> On Sun, Mar 20, 2011 at 10:06 AM, Lior Schachter <[email protected]> wrote:
> >>> Hi,
> >>> What is the API or configuration for changing the default hash function
> >>> for a specific htable.
> >>>
> >>> thanks,
> >>> Lior
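
P.S. To make the key-design point concrete, here is a minimal sketch of one common workaround, prefixing ("salting") the row key with a bucket number so that keys for the same date no longer sort next to each other. This is only an illustration under assumptions: the function name `salted_row_key`, the bucket count of 16, and the use of MD5 are all hypothetical choices of mine, not anything from HBase itself, and the table would still need to be pre-split on the bucket prefixes for the writes to land on different regions.

```python
import hashlib

# Assumption: roughly one bucket per region server; tune for the real cluster.
NUM_BUCKETS = 16


def salted_row_key(date: str, some_data: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Build a row key of the form <bucket>_<date>_<somedata>.

    The bucket is derived from a hash of the original key, so the same
    input always maps to the same bucket and a reader can recompute it.
    """
    original = f"{date}_{some_data}"
    digest = hashlib.md5(original.encode("utf-8")).digest()
    bucket = digest[0] % num_buckets
    # Zero-padded so the prefixes sort consistently (fine for up to 100 buckets).
    return f"{bucket:02d}_{original}"


if __name__ == "__main__":
    for i in range(5):
        print(salted_row_key("20110320", f"event{i}"))
```

The trade-off is the one Chris describes in reverse: writes spread out, but a scan over one date now has to be issued once per bucket and the results merged, because the rows for that date are no longer contiguous.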
