> 5. No more than 32 nodes in your SolrCloud cluster.

I hope this isn't too OT, but what tradeoffs is this based on? I would have
thought it easy to hit this number with a big index and high load, since
both the number of shards and the number of replicas are meant to scale
horizontally.

> 6. Don't return more than 250 results on a query.
>
> None of those is a hard limit, but don't go beyond them unless your Proof
of Concept testing proves that performance is acceptable for your situation.
>
> Start with a simple 4-node, 2-shard, 2-replica cluster for preliminary
tests and then scale as needed.
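
To make the arithmetic of that starting point concrete, here is a rough sketch. It assumes "2-replica" means two copies of each shard, so 2 shards x 2 copies = 4 cores, one per node. The node names and the round-robin placement are hypothetical illustrations only, not how SolrCloud itself assigns replicas.

```python
def sketch_layout(num_shards, copies_per_shard, nodes):
    """Spread shard copies over nodes round-robin (illustration only)."""
    cores = [
        (f"shard{s + 1}", f"replica{r + 1}")
        for s in range(num_shards)
        for r in range(copies_per_shard)
    ]
    placement = {node: [] for node in nodes}
    for i, core in enumerate(cores):
        placement[nodes[i % len(nodes)]].append(core)
    return placement


# Hypothetical node names for the suggested 4-node starting cluster.
nodes = ["node1", "node2", "node3", "node4"]
placement = sketch_layout(num_shards=2, copies_per_shard=2, nodes=nodes)
for node, cores in placement.items():
    print(node, cores)
```

Growing either the shard count or the copies per shard raises the number of cores to place, which is where the node count starts to climb.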
>
> Dynamic and multivalued fields? Try to stay away from them - except for
the simplest cases, they are usually an indicator of a weak data model.
Sure, it's fine to store a relatively small number of values in a
multivalued field (say, dozens of values), but be aware that you can't
directly access individual values, you can't tell which was matched on a
query, and you can't coordinate values between multiple multivalued fields.
Except for very simple cases, multivalued fields should be flattened into
multiple documents with a parent ID.
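
The flattening suggested above might look roughly like the sketch below, done on the client side before indexing. The field names ("author", "parent_id") and the child ID scheme are made up for illustration; nothing here is a Solr API.

```python
def flatten_multivalued(doc, multivalued_field, id_field="id"):
    """Split one document with a multivalued field into child documents.

    Each child holds a single value plus a parent_id that points back at
    the original document. All field names are hypothetical.
    """
    parent_id = doc[id_field]
    children = []
    for position, value in enumerate(doc.get(multivalued_field, [])):
        children.append({
            id_field: f"{parent_id}-{multivalued_field}-{position}",
            "parent_id": parent_id,
            multivalued_field: value,
        })
    return children


book = {"id": "book1", "title": "Solr notes", "author": ["Ann", "Bob"]}
author_docs = flatten_multivalued(book, "author")
for child in author_docs:
    print(child)
```

With one value per child document, a match identifies exactly which value was hit, and the parent_id leads back to the original record.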
>
> Since you brought up the topic of dynamic fields, I am curious how you
got the impression that they were a good technique to use as a starting
point. They're fine for prototyping and hacking, and fine when used in
moderation, but not when used to excess. The whole point of Solr is
searching and searching is optimized within fields, not across fields, so
having lots of dynamic fields is counter to the primary strengths of Lucene
and Solr. And... schemas with lots of dynamic fields tend to be difficult
to maintain. For example, if you wanted to ask a support question here, one
of the first things we want to know is what your schema looks like, but
with lots of dynamic fields it is not possible to have a simple discussion
of what your schema looks like.
>
> Sure, there is something called "schemaless design" (and Solr supports
that in 4.4), but that's very different from heavy reliance on dynamic
fields in the traditional sense. Schemaless design is A-OK, but using
dynamic fields for "arrays" of data in a single document is a poor match
for the search features of Solr (e.g., Edismax searching across multiple
fields.)
>
> One other tidbit: Although Solr does not enforce naming conventions for
field names, and you can put special characters in them, there are plenty
of features in Solr, such as the common "fl" parameter, where field names
are expected to adhere to Java naming rules. When people start "going wild"
with dynamic fields, it is common that they start "going wild" with their
names as well, using spaces, colons, slashes, etc. that cannot be parsed in
the "fl" and "qf" parameters, for example. Please don't go there!
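
One way to catch such names before they reach the schema is a small client-side check like the sketch below. The pattern is an assumption: it is deliberately stricter than the real Java identifier rules (ASCII letters, digits, and underscores only, not starting with a digit), and the sample field names are invented.

```python
import re

# Assumed conservative pattern: stricter than Java identifiers, which
# would also allow "$" and non-ASCII letters.
SAFE_FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def unsafe_field_names(names):
    """Return the names that do not fully match the conservative pattern."""
    return [name for name in names if not SAFE_FIELD_NAME.fullmatch(name)]


# Hypothetical field names, some using spaces, colons, and slashes.
candidates = ["price_usd", "attr color", "size:large", "tags/2013", "title"]
bad = unsafe_field_names(candidates)
print(bad)
```

Rejecting these names at indexing time is cheaper than discovering later that they cannot be listed in "fl" or "qf".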
>
> In short, put up a small cluster and start doing Proof of Concept
testing. Stay within my suggested guidelines and you should do okay.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Marcelo Elias Del Valle
> Sent: Monday, July 08, 2013 9:46 AM
> To: solr-user@lucene.apache.org
> Subject: Solr limitations
>
>
> Hello everyone,
>
>    I am trying to find information about possible Solr limitations I
> should consider in my architecture. Things like max number of dynamic
> fields, max number of documents in SolrCloud, etc.
>    Does anyone know where I can find this info?
>
> Best regards,
> --
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr
