bq: "That is really a job for streaming, not simple faceting."
True, it’s the next step to improve our performance (right now we are using
JSON facets), and 6.3.0 has a lot of useful tools to work with streaming
expressions. Our last release before 6.3 was 5.3.1 and the streaming
expressions
Yago Riveiro wrote:
One thing that I forgot to mention is that my clients can aggregate by any
field in the schema with limit=-1. This is not a problem with 99% of the
fields, but 2 or 3 of them are URLs. URLs have very high cardinality, and one
of the reasons for sharding collections is to lower the memory footprint
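
For readers following along, here is a rough sketch of what an unbounded terms facet looks like as a JSON Facet API request body. The field name `url` and the facet label `by_field` are made up for illustration; the real schema is not shown in this thread.

```python
import json


def terms_facet_request(field, limit=-1):
    """Build a JSON Facet API request body for a terms facet on `field`.

    The field passed in is hypothetical; the thread does not name the
    actual URL fields.
    """
    return {
        "query": "*:*",
        "limit": 0,  # return no documents, only facet buckets
        "facet": {
            "by_field": {  # arbitrary label chosen for this sketch
                "type": "terms",
                "field": field,
                "limit": limit,  # -1 asks for every bucket
            }
        },
    }


body = terms_facet_request("url")
print(json.dumps(body, indent=2))
```

With limit=-1 every distinct value becomes a bucket, so the response grows with the field's cardinality, which is why the URL fields are the painful ones.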
Yago Riveiro wrote:
My cluster holds more than 10B documents stored in 15T.
The size of my collections varies, but I have collections with 800M
documents distributed over the 12 nodes. The number of documents per shard
is ~66M, and indeed the performance is good.
I need the collections to isolate the data of my
Right, so if I'm doing the math right you have 2,400 replicas per JVM?
I'm not clear whether each node has a single JVM or not.
Anyway. 2048 is indeed much too high. If nothing else, dropping it to,
say, 64 would show whether this was the real root of your problem or not.
Even if it slowed
Yes, I changed the value of coreLoadThreads.
With the default value a node takes about 40 minutes to become available with
all replicas up.
Right now I have ~1.2K collections with 12 shards each, 2 replicas, spread
across 12 nodes. Indeed, the value I configured (2048) may be too high, but I can start
Hmmm, have you changed coreLoadThreads? We had a problem with this a
while back with loading lots and lots of cores, see:
https://issues.apache.org/jira/browse/SOLR-7280
But that was fixed in 6.2, so unless you changed the number of threads
used to load cores it shouldn't be a problem on 6.3...
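
The replica and shard figures quoted in this thread can be checked with a few lines of arithmetic. This sketch assumes "~1.2K collections" means 1200, that replicas are spread evenly over the nodes, and that each node runs a single JVM (the thread leaves that last point open).

```python
# Figures quoted in the thread; 1200 is an assumed reading of "~1.2K".
collections = 1200
shards_per_collection = 12
replicas_per_shard = 2
nodes = 12

# Every shard of every collection exists once per replica.
total_replicas = collections * shards_per_collection * replicas_per_shard

# Assumes an even spread of replicas over the nodes.
replicas_per_node = total_replicas // nodes

# The 800M-document collection split over its 12 shards.
docs_in_collection = 800_000_000
docs_per_shard = docs_in_collection / shards_per_collection

print(total_replicas)
print(replicas_per_node)
print(round(docs_per_shard / 1e6, 1))
```

Under those assumptions this gives 28,800 replicas in total and 2,400 per node, matching the estimate above, and roughly 66.7M documents per shard for the 800M-document collection.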