Hi Dave,
  Our example is simpler than the wiki-search one, and I had forgotten exactly 
how wiki-search is structured: We're using a simple, single-table layout, 
everything sharded. We also have some other stuff in the column family, but 
simplified, it's just:

row: ShardID
colFam: Term
colQual: Document

And then searches will use an iterator extending IntersectingIterator to find 
results matching specified terms etc.

Thanks,
Krishmin

On Oct 30, 2012, at 3:43 PM, [email protected] wrote:

> Krishmin,
> 
>   In the wikisearch example there is a non-sharded index table and a sharded 
> document table. The index table is used to reduce the number of tablets that 
> need to be searched for a given set of terms. Is your setup similar? I'm a 
> little confused since you mention using a sharded index table and that all 
> queries will have an infinite range.
> 
> Dave Marion
> 
> From: "Krishmin Rai" <[email protected]>
> To: [email protected]
> Sent: Tuesday, October 30, 2012 3:28:15 PM
> Subject: Re: Number of partitions for sharded table
> 
> I should clarify that I've been pre-splitting tables at each shard so that 
> each tablet consists of a single row.
> 
> On Oct 30, 2012, at 3:06 PM, Krishmin Rai wrote:
> 
> > Hi All, 
> >  We're working with an index table whose row is a shardId (an integer, like 
> > the wiki-search or IndexedDoc examples). I was just wondering what the 
> > right strategy is for choosing a number of partitions, particularly given a 
> > cluster that could potentially grow.
> > 
> >  If I simply set the number of shards equal to the number of slave nodes, 
> > additional nodes would not improve query performance (at least over the 
> > data already ingested). But starting with more partitions than slave nodes 
> > would result in multiple tablets per tablet server… I'm not really sure how 
> > that would impact performance, particularly given that all queries against 
> > the table will be batchscanners with an infinite range.
> > 
> >  Just wondering how others have addressed this problem, and if there are 
> > any performance rules of thumb regarding the ratio of tablets to tablet 
> > servers.
> > 
> > Thanks!
> > Krishmin
> 

Reply via email to