To clarify, in answer to the questions:
We scan with the reverse flag set to true (so we are scanning in reverse).
It is a range scan, not a scan of the entire table.
We do not do deletes, only occasional overwrites.
I will try a major compaction to see if there is a performance difference.
I will also have to run the scan from the hbase cli to see if there is a difference.
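
For reference, here is a rough sketch of how the bounds for a reversed range scan over this key layout could be built and timed. It is only an illustration under stated assumptions, not our actual client code: it assumes the datetime component is a fixed-width, lexicographically sortable string, and the names reversed_scan_bounds, simulate_reversed_scan and time_call are made up for this example. It relies on the fact that in a reversed scan the start row is the upper end of the range and the stop row is the lower end. The scan is simulated on an in-memory sorted list, so it runs without a cluster.

```python
import time

SEP = b"|"  # separator used in the row key: code|account|2ndCode|2ndAccount|datetime|unique_id


def reversed_scan_bounds(code, account, code2, account2, dt_from, dt_to):
    """Return (start_row, stop_row) for a reversed scan over [dt_from, dt_to].

    Assumption: datetime values are fixed-width strings that sort
    lexicographically in time order.
    """
    prefix = SEP.join([code, account, code2, account2]) + SEP
    # Reversed scans begin at the HIGH end of the range. '}' (0x7D) is the byte
    # right after '|' (0x7C), so this key sorts above every row for dt_to.
    start_row = prefix + dt_to + b"}"
    # The stop row is the LOW end. Every real row for dt_from carries a
    # "|unique_id" suffix, so it sorts above this bare prefix.
    stop_row = prefix + dt_from
    return start_row, stop_row


def simulate_reversed_scan(sorted_keys, start_row, stop_row):
    """In-memory stand-in for a reversed scan: walk keys from high to low."""
    return [k for k in reversed(sorted_keys) if stop_row < k <= start_row]


def time_call(fn, repeats=5):
    """Run fn several times; return (best, mean) wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return min(samples), sum(samples) / len(samples)
```

The same time_call wrapper could be put around the real Thrift scan before and after the major compaction, so the two runs are measured the same way.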



On Fri, Dec 11, 2015 at 9:25 AM, Stack <[email protected]> wrote:

> On Wed, Dec 9, 2015 at 8:14 PM, Abraham Tom <[email protected]> wrote:
>
>> Hi all
>> does anybody have scripts or examples to do performance tests and
>> benchmarks on scans
>> We are using cloudera 2.5.5 hbase 1.0
>> and most interaction is via Node.js through Thrift.
>> we have a table that's currently about 20 million rows
>> key is code|account|2ndCode|2ndAccount|datetime|
>> unique_id
>>
>> we do a lot of scans on datetime in reverse
>>
>
>
> Date is stored in reverse order or you are scanning in reverse?
>
>
>
>> The scan used to take less than a second when the table was under 5
>> million rows,
>> but lately it's been about 10 sec now that we are at about 20 million rows
>>
>>
> You are scanning whole table or just a range? Has how much you are
> scanning changed? Any filters? Do you do deletes? Does major compacting
> change the performance you are seeing?
>
>
>
>> Native API out of the question due to Node.js, so it has to be thrift
>>
>> no column qualifiers either
>>
>> but wanted to test native before thrift.  Any github repos or ideas would
>> be helpful
>>
>>
>>
> You could test with YCSB or with the bundled PerformanceEvaluation, both
> have 'scan' tests, but it seems like something particular to how you are
> querying that you'd like to fix.
>
> St.Ack
>
>
>
>> --
>> Abraham Tom
>> Email:   [email protected]  <[email protected]>
>>
>
>


-- 
Abraham Tom
Email:   [email protected]
Phone:  415-515-3621
