Krishmin,

In the wikisearch example there is a non-sharded index table and a sharded document table. The index table is used to reduce the number of tablets that need to be searched for a given set of terms. Is your setup similar? I'm a little confused since you mention using a sharded index table and that all queries will have an infinite range.

Dave Marion

----- Original Message -----
From: "Krishmin Rai" <[email protected]>
To: [email protected]
Sent: Tuesday, October 30, 2012 3:28:15 PM
Subject: Re: Number of partitions for sharded table

I should clarify that I've been pre-splitting tables at each shard so that each tablet consists of a single row.

On Oct 30, 2012, at 3:06 PM, Krishmin Rai wrote:

> Hi All,
> We're working with an index table whose row is a shardId (an integer, like
> the wiki-search or IndexedDoc examples). I was just wondering what the right
> strategy is for choosing a number of partitions, particularly given a cluster
> that could potentially grow.
>
> If I simply set the number of shards equal to the number of slave nodes,
> additional nodes would not improve query performance (at least over the data
> already ingested). But starting with more partitions than slave nodes would
> result in multiple tablets per tablet server… I'm not really sure how that
> would impact performance, particularly given that all queries against the
> table will be batchscanners with an infinite range.
>
> Just wondering how others have addressed this problem, and if there are any
> performance rules of thumb regarding the ratio of tablets to tablet servers.
>
> Thanks!
> Krishmin
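
To make the trade-off in the question concrete, here is a minimal Python sketch of the arithmetic involved. It is not Accumulo code: the helper names `shard_split_points` and `tablets_per_server` are hypothetical, shard IDs are assumed to be zero-padded decimal strings, a split point is assumed to mark the last row of a tablet, and tablets are assumed to be spread as evenly as possible across tablet servers.

```python
def shard_split_points(num_shards, width=4):
    """Split points that would give each shard row its own tablet.

    Assumption: shard IDs are zero-padded decimal strings, so that
    lexicographic order matches numeric order. N rows need N-1 splits,
    because the last tablet is open-ended.
    """
    return [f"{i:0{width}d}" for i in range(num_shards - 1)]


def tablets_per_server(num_shards, num_servers):
    """Tablet count per server, assuming an ideal even spread.

    This is only the arithmetic of dividing tablets among servers;
    a real balancer may place tablets differently.
    """
    base, extra = divmod(num_shards, num_servers)
    return [base + 1 if i < extra else base for i in range(num_servers)]


if __name__ == "__main__":
    # Shards fixed at the original cluster size (4) versus over-partitioned (16),
    # as the cluster grows from 4 to 8 to 16 tablet servers.
    for shards in (4, 16):
        for servers in (4, 8, 16):
            print(shards, "shards on", servers, "servers:",
                  tablets_per_server(shards, servers))
```

With the shard count fixed at the original number of nodes, any servers added later hold no tablets for this table, which is the concern raised in the original message. Starting with more shards than nodes leaves room for the table to spread out as servers are added, at the cost of several tablets per server early on.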
