You've repeated your original statement. Shawn's observation is that 10M docs is a very small corpus by Solr standards. You either have very demanding document/search combinations or you have a poorly tuned Solr installation.
On reasonable hardware I expect 25-50M documents to have sub-second response times. So what we're trying to do is make sure this isn't an "XY" problem. From Hossman's apache page:

Your question appears to be an "XY Problem" ... that is: you are dealing
with "X", you are assuming "Y" will help you, and you are asking about "Y"
without giving more details about the "X" so that we can understand the
full issue. Perhaps the best solution doesn't involve "Y" at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341

So again, how would you characterize your documents? How many fields? What do queries look like? How much physical memory is on the machine? How much memory have you allocated to the JVM?

You might review: http://wiki.apache.org/solr/UsingMailingLists

Best,
Erick

On Thu, Jun 18, 2015 at 3:23 PM, wwang525 <wwang...@gmail.com> wrote:
> The query without load is still under 1 second. But under load, response
> time can be much longer due to queued-up queries.
>
> We would like to shard the data to something like 6M docs/shard, which
> should still give an under-1-second response time under load.
>
> What are some best practices for sharding the data? For example, we could
> shard the data by date range, but that is pretty dynamic; we could shard
> by some other property, but if the data is not evenly distributed, you
> may not be able to shard it further.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/How-to-do-a-Data-sharding-for-data-in-a-database-table-tp4212765p4212803.html
> Sent from the Solr - User mailing list archive at Nabble.com.
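(The uneven-distribution worry in the quoted question is usually answered by hash-based routing rather than sharding on a business attribute: hashing a stable document key spreads documents evenly across shards even when attributes like date are heavily skewed. SolrCloud's default compositeId router does this internally with a MurmurHash over the id. Below is a minimal illustrative sketch, not Solr's actual implementation; the doc-id format and use of md5 are assumptions for the example.)

```python
import hashlib
from collections import Counter

NUM_SHARDS = 6

def shard_for(doc_id: str) -> int:
    """Map a document ID to a shard by hashing it.

    Illustration only: SolrCloud's compositeId router hashes the id
    with MurmurHash; md5 is used here just for a self-contained demo.
    """
    h = int(hashlib.md5(doc_id.encode("utf-8")).hexdigest(), 16)
    return h % NUM_SHARDS

# Even date-prefixed (skewed) IDs land evenly across the 6 shards:
counts = Counter(shard_for(f"2015-06-18/doc-{n}") for n in range(60000))
print(sorted(counts.values()))  # roughly 10000 documents per shard
```

The point of the sketch: because the hash ignores document attributes entirely, the shard sizes stay balanced as data grows, which is why range-based schemes (date, region, etc.) are usually reserved for cases where queries can be routed to a single shard.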