Re: hash function per table

Oleg Ruchovets Sun, 20 Mar 2011 13:27:10 -0700

Can you share more information about your tests?



   I still have a couple of issues that I don't understand:
    1) public Scan setTimeRange(long minStamp, long maxStamp) vs. the
startKey/endKey approach: which is the better approach, and is there a
significant difference in execution time between the two?
    2) Suppose that while inserting data I distribute it across the
regions and create an index at the same time. Will the index help me
improve the scan process?
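
To make the distribution idea in 2) concrete, here is a minimal sketch of one way to spread <date>_<somedata> keys over several regions: prefix each key with a small hash-derived bucket (a "salt"). It is plain Java with no HBase client calls; the class name SaltedKeys, the BUCKETS value of 16 and both helper methods are assumptions made up for illustration, not part of HBase. The sketch also shows the cost: reading one date back then takes one start/stop row range per bucket instead of a single range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: SaltedKeys, BUCKETS, saltedKey and scanRanges are
// hypothetical names, not HBase APIs.
final class SaltedKeys {
    // Assumed bucket count; in practice this would be chosen to match
    // the number of regions or nodes that should share the write load.
    static final int BUCKETS = 16;

    // Prefix the original <date>_<somedata> key with a two-digit bucket
    // derived from the key's hash, so consecutive dates land in
    // different parts of the sorted key space.
    static String saltedKey(String date, String someData) {
        String original = date + "_" + someData;
        int bucket = Math.floorMod(original.hashCode(), BUCKETS);
        return String.format(Locale.ROOT, "%02d_%s", bucket, original);
    }

    // Build one [startRow, stopRow) pair per bucket for a single date.
    // The stop row replaces the trailing '_' with '`' (the next ASCII
    // character), so it sorts just after every key with that prefix.
    static List<String[]> scanRanges(String date) {
        List<String[]> ranges = new ArrayList<>();
        for (int b = 0; b < BUCKETS; b++) {
            String start = String.format(Locale.ROOT, "%02d_%s_", b, date);
            String stop = String.format(Locale.ROOT, "%02d_%s`", b, date);
            ranges.add(new String[] {start, stop});
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(saltedKey("20110320", "abc"));
        for (String[] r : scanRanges("20110320")) {
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```

Each pair returned by scanRanges would serve as the start row and stop row of its own scan, with the results merged on the client.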




On Sun, Mar 20, 2011 at 10:03 PM, Pete Haidinyak <[email protected]> wrote:

> I went through this discussion a month or so ago and came away with the
> opinion that you can either have an efficient load with random keys but
> then an inefficient 'scan' that cannot use start and end rows, or have an
> inefficient import with sequential keys and then scan using start and end
> rows.
>
> -Pete
>
>
>
> On Sun, 20 Mar 2011 12:52:24 -0700, Oleg Ruchovets <[email protected]>
> wrote:
>
>  Actually, the discussion started from this post:
>>
>>
>>
>> http://search-hadoop.com/m/XX3nW68JsY1/hbase+insertion+optimisation&subj=hbase+insertion+optimisation+
>>
>> When simply inserting data with the row key <date>_<somedata>, I noticed
>> that only one node works (the one hosting the region being written to).
>> With 10-15 nodes I think it is inefficient to write data to only one
>> region. I want the data to be inserted into as many nodes as possible
>> simultaneously. Correct me, guys, but in that case the write job will
>> take less time, am I right?
>>
>> Oleg.
>>
>> On Sun, Mar 20, 2011 at 8:57 PM, Chris Tarnas <[email protected]> wrote:
>>
>>  There is none - HBase uses a total order partitioner. The straight key
>>> value itself determines which region a row is put into. This allows for
>>> very rapid scans of sequential data, among other things, but does mean
>>> it is easier to hotspot regions. Key design is very important.
>>>
>>> -chris
>>>
>>> On Mar 20, 2011, at 11:41 AM, Lior Schachter wrote:
>>>
>>> > the hash function that distributes the rows between the regions.
>>> >
>>> > On Sun, Mar 20, 2011 at 8:36 PM, Stack <[email protected]> wrote:
>>> >
>>> >> Hash?  Which hash are you referring to sir?
>>> >> St.Ack
>>> >>
>>> >> On Sun, Mar 20, 2011 at 10:06 AM, Lior Schachter <[email protected]
>>> >
>>> >> wrote:
>>> >>> Hi,
>>> >>> What is the API or configuration for changing the default hash
>>> function
>>> >> for
>>> >>> a specific htable.
>>> >>>
>>> >>> thanks,
>>> >>> Lior
>>> >>>
>>> >>
>>>
>>>
>
