>Didnt follow u completely here. There wont be any get() happening.. As the
>exact rowkey in a region we get from the index table, we can seek to the
>exact position and return that row.

Sorry, when I misused "get()" here, I meant seeking. Yes, if only a small number of rows is returned, this works perfectly. As you said, you get the exact rowkey positions per region and simply seek to them.

I was trying to work out the case where the number of result rows increases massively. Like in Anil's case, he wants to do a scan query against the secondary index (timestamp): "select all rows from timestamp1 to timestamp2", with no customerId provided. During that time period, he might have a big chunk of rows from different customerIds. The index table returns a lot of rowkey positions for different customerIds (I believe they are scattered across different regions), so you end up seeking to all those different positions in different regions to return all the rows needed.

According to your presentation, page 14 - Performance Test Results (Scan): without the index, the time increases linearly as the number of result rows increases. With the index, on the other hand, the time spent climbs much more quickly than without it.

Btw, quick question: in your presentation, is the time scale there in seconds or milliseconds? :)

- Shengjie

On 27 December 2012 15:54, Anoop John <[email protected]> wrote:

> >how the massive number of get() is going to
> perform againt the main table
>
> Didnt follow u completely here. There wont be any get() happening.. As the
> exact rowkey in a region we get from the index table, we can seek to the
> exact position and return that row.
>
> -Anoop-
>
> On Thu, Dec 27, 2012 at 6:37 PM, Shengjie Min <[email protected]>
> wrote:
>
> > how the massive number of get() is going to
> > perform againt the main table
> >

--
All the best,
Shengjie Min
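
To make the concern about scattered seeks concrete, here is a rough toy model in Python. It is only a sketch and not the actual index implementation: the rowkey layout `(customer_id, timestamp)`, the fixed-size regions, and all function names are assumptions made up for illustration. It counts how many seeks a timestamp-range query issues and how many regions those seeks land in.

```python
import bisect

def build_table(num_customers, rows_per_customer):
    """Main table: sorted rowkeys, assumed to be (customer_id, timestamp)."""
    return sorted(
        (c, t) for c in range(num_customers) for t in range(rows_per_customer)
    )

def split_regions(rowkeys, region_size):
    """Cut the sorted rowkeys into contiguous regions of region_size rows."""
    return [rowkeys[i:i + region_size] for i in range(0, len(rowkeys), region_size)]

def index_lookup(rowkeys, ts_start, ts_end):
    """Stand-in for the index table: rowkeys with timestamp in [ts_start, ts_end)."""
    return [k for k in rowkeys if ts_start <= k[1] < ts_end]

def seek_rows(regions, wanted):
    """Seek to each wanted rowkey; return (rows found, seeks issued, regions touched)."""
    region_starts = [r[0] for r in regions]
    found, touched = [], set()
    for key in wanted:
        r = bisect.bisect_right(region_starts, key) - 1  # region owning the key
        pos = bisect.bisect_left(regions[r], key)        # seek inside that region
        if pos < len(regions[r]) and regions[r][pos] == key:
            found.append(key)
            touched.add(r)
    return found, len(wanted), len(touched)

table = build_table(num_customers=100, rows_per_customer=50)
regions = split_regions(table, region_size=500)
wanted = index_lookup(table, ts_start=10, ts_end=20)
found, seeks, touched = seek_rows(regions, wanted)
print("rows:", len(found), "seeks:", seeks,
      "regions touched:", touched, "of", len(regions))
```

In this toy setup every region is touched, because each customer contributes a few rows in the time range and customers are spread over all regions. It says nothing about real timings; it only shows why the number of seeks grows with the result size while the rows stay scattered.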
