You've repeated your original statement. Shawn's observation is that 10M docs is a very small corpus by Solr standards. You either have very demanding document/search combinations or you have a poorly tuned Solr installation.
On reasonable hardware I expect 25-50M documents to have sub-second response times. So what we're trying to do is make sure this isn't an "XY" problem. From Hossman's apache page:

Your question appears to be an "XY Problem" ... that is: you are dealing
with "X", you are assuming "Y" will help you, and you are asking about "Y"
without giving more details about the "X" so that we can understand the
full issue. Perhaps the best solution doesn't involve "Y" at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341

So again, how would you characterize your documents? How many fields? What do queries look like? How much physical memory is on the machine? How much memory have you allocated to the JVM?

You might review: http://wiki.apache.org/solr/UsingMailingLists

Best,
Erick

On Thu, Jun 18, 2015 at 3:23 PM, wwang525 <wwang...@gmail.com> wrote:
> The query without load is still under 1 second. But under load, response
> time can be much longer due to queued-up queries.
>
> We would like to shard the data to something like 6M docs/shard, which
> should still give an under-1-second response time under load.
>
> What are some best practices for sharding the data? For example, we could
> shard the data by date range, but that is pretty dynamic; we could shard
> by some other property, but if the data is not evenly distributed, you
> may not be able to shard it further.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/How-to-do-a-Data-sharding-for-data-in-a-database-table-tp4212765p4212803.html
> Sent from the Solr - User mailing list archive at Nabble.com.
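(The uneven-distribution worry in the quoted question is usually answered by hash-based routing rather than sharding on a business attribute: hashing a stable document key spreads documents evenly across shards even when attributes like date are heavily skewed. SolrCloud's default compositeId router does this internally with a MurmurHash over the id. Below is a minimal illustrative sketch, not Solr's actual implementation; the doc-id format and use of md5 are assumptions for the example.)

```python
import hashlib
from collections import Counter

NUM_SHARDS = 6

def shard_for(doc_id: str) -> int:
    """Map a document ID to a shard by hashing it.

    Illustration only: SolrCloud's compositeId router hashes the id
    with MurmurHash; md5 is used here just for a self-contained demo.
    """
    h = int(hashlib.md5(doc_id.encode("utf-8")).hexdigest(), 16)
    return h % NUM_SHARDS

# Even date-prefixed (skewed) IDs land evenly across the 6 shards:
counts = Counter(shard_for(f"2015-06-18/doc-{n}") for n in range(60000))
print(sorted(counts.values()))  # roughly 10000 documents per shard
```

The point of the sketch: because the hash ignores document attributes entirely, the shard sizes stay balanced as data grows, which is why range-based schemes (date, region, etc.) are usually reserved for cases where queries can be routed to a single shard.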