To clarify, in answer to the questions:

We do use the scan with the reversed flag set to true (so we scan in reverse).
It is a range scan, not a scan of the entire table.
We do not do deletes, only occasional overwrites.

I will try a major compaction to see if there is a performance difference.
I will also run the same scan from the hbase CLI to see if there is a difference.

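For reference, here is a minimal sketch of how the start and stop rows of such a reversed range scan relate to the composite key. It does not talk to HBase; it only simulates the ordering on a sorted list of keys. The function names, the fixed-width `YYYYMMDDhhmmss` datetime format, and the ASCII-only key assumption are all hypothetical and not taken from the actual table. The one HBase behavior it relies on is that a reversed scan starts at the larger key (inclusive) and stops at the smaller key (exclusive).

```python
import bisect

SEP = "|"

def make_key(code, account, code2, account2, dt, unique_id):
    """Build a row key shaped like code|account|2ndCode|2ndAccount|datetime|unique_id."""
    return SEP.join([code, account, code2, account2, dt, unique_id])

def reverse_scan_bounds(code, account, code2, account2, newest_dt, oldest_dt):
    """Return (start_row, stop_row) for a reversed scan over one key prefix.

    Assumes datetimes are fixed-width strings that sort chronologically and
    that keys are plain ASCII, so string order matches byte order.
    """
    prefix = SEP.join([code, account, code2, account2])
    # Start just past every unique_id at newest_dt: "\xff" sorts after ASCII.
    start_row = prefix + SEP + newest_dt + SEP + "\xff"
    # Stop row is exclusive; every key at oldest_dt is longer, so it sorts
    # after this value and is still returned.
    stop_row = prefix + SEP + oldest_dt
    return start_row, stop_row

def simulate_reverse_scan(sorted_keys, start_row, stop_row):
    """Return the keys in (stop_row, start_row], newest first."""
    hi = bisect.bisect_right(sorted_keys, start_row)  # keys <= start_row
    lo = bisect.bisect_right(sorted_keys, stop_row)   # keys <= stop_row are skipped
    return sorted_keys[lo:hi][::-1]
```

If the bounds are built this way, the scan should only touch rows under one code|account|2ndCode|2ndAccount prefix, so its cost should depend on the size of the range rather than on the total row count. That makes it a useful check against what the Thrift client actually sends.
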
On Fri, Dec 11, 2015 at 9:25 AM, Stack <[email protected]> wrote:

> On Wed, Dec 9, 2015 at 8:14 PM, Abraham Tom <[email protected]> wrote:
>
>> Hi all
>> does anybody have scripts or examples to do performance tests and
>> benchmarks on scans
>> We are using cloudera 2.5.5 hbase 1.0
>> and most interaction is via Node.js through Thrift.
>> we have a table thats currently about 20 million rows
>> key is code|account|2ndCode|2ndAccount|datetime|unique_id
>>
>> we do a lot of scans on datetime in reverse
>>
>
> Date is stored in reverse order or you are scanning in reverse?
>
>> The scan use to take less than a second when the table was under 5
>> million,
>> but lately its been about 10 sec now that we are about 20 million rows
>>
>
> You are scanning whole table or just a range? Has how much you are
> scanning changed? Any filters? Do you do deletes? Does major compacting
> change the performance you are seeing?
>
>> Native API out of the question due to Node.js, so it has to be thrift
>>
>> no column qualifiers either
>>
>> but wanted to test native before thrift. Any github repos or ideas would
>> be helpful
>>
>
> You could test with YCSB or with the bundled PerformanceEvaluation, both
> have 'scan' tests, but it seems like something particular to how you are
> querying that you'd like to fix.
>
> St.Ack
>
>> --
>> Abraham Tom
>> Email: [email protected] <[email protected]>
>>

--
Abraham Tom
Email: [email protected]
Phone: 415-515-3621
