My id field definition in the schema is indexed=true docValues=true
stored=false useDocValuesAsStored=true.
Could that cause this kind of behavior, the id field not being stored?
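For reference, such a field definition would look roughly like this in the managed schema (a sketch; the type name "string" is an assumption, not copied from the actual config):

```xml
<!-- id stored only as docValues; reads come from the docValues column,
     not from the stored-fields files. Type name is assumed. -->
<field name="id" type="string" indexed="true" docValues="true"
       stored="false" useDocValuesAsStored="true"/>
```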
On 2021-12-23 15:57, Jeff Courtade wrote:
Rules of thumb to reduce disk reads with Solr:
Tune Linux.
Have fast disks/SSDs for the indexes.
Use separate disks for OS/programs, indexes, and Solr logs.
Have enough RAM on the system to load the entire index into RAM,
with room left over for applications and the OS.
Linux stores files it has accessed in a RAM buffer; this is the
"buffers" figure you see when looking at memory allocation.
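One way to see how much of the index the kernel is holding in that cache (Linux-specific; the numbers are machine-dependent):

```shell
# "buff/cache" in free is file data the kernel has cached in RAM;
# /proc/meminfo breaks it out as Buffers and Cached.
free -h
grep -E '^(MemTotal|Buffers|Cached):' /proc/meminfo
```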
Heap allocation is best left to Java 11 or higher.
If I were you I would rearchitect to get your indexes down to a more
manageable size and run more Solr nodes with enough RAM and fast disks.
If you can get 400+ GB of RAM on your system, it will help you with the
IO issues.
On Thu, Dec 23, 2021, 7:19 AM Ufuk YILMAZ <[email protected]>
wrote:
I have a problem with my SolrCloud cluster: when I request a few
stored fields, the disk read rate caps out at its maximum for a long
period of time, but when I request no fields the response time is
consistently a few seconds.
My cluster has 4 nodes. Total index size is 400GB per node. Each node
has 96GB of RAM, 24GB of which is allocated to the Solr heap. All data
is persisted on SSDs.
These tests are done when no other reads are being made (the iotop
command shows 0 reads), but indexing is ongoing, with about 200
kilobytes per second being written to disk.
When I send a query with rows=0, response time is consistently around
0.5-2 seconds.
But when I request rows=5000 and a single stored field (the field type
is text_general with stored=true), response time jumps to 3-10 minutes,
during which disk reads top out at 1000M/s (the maximum my disks can
do) and stay there until the request finishes. Document size is around
1-4KB and a typical result set is 50-1000 docs. If I send a few
requests at the same time, it gets even worse and I start to get
errors.
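For concreteness, the two request shapes being compared look roughly like this (collection and field names are placeholders, not my actual schema):

```shell
# Fast: no stored fields fetched
curl 'http://localhost:8983/solr/mycollection/select?q=*:*&rows=0'

# Slow: 5000 rows with a single stored text_general field
curl 'http://localhost:8983/solr/mycollection/select?q=*:*&rows=5000&fl=my_stored_field'
```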
Why does Solr need to read hundreds of gigabytes of data to return a
few hundred kilobytes of stored fields?
I have been reading up on how the index and stored fields are organized
to find out whether this is expected. If queries with rows=0 were slow
too, I'd simply say the index is too big for my machines.
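From what I've read, Lucene keeps stored fields in compressed chunks, so fetching one document can force a whole-chunk read and decompression. A back-of-the-envelope sketch with assumed numbers (the ~16KB chunk size is my guess, not measured) suggests that alone shouldn't add up to hundreds of gigabytes:

```python
# Rough read-amplification estimate for fetching stored fields.
# Assumptions (not from measurements): compressed chunk size of 16 KiB,
# average doc size of 2 KiB, and the worst case where every requested
# doc lands in a different chunk.

CHUNK_BYTES = 16 * 1024   # assumed compressed stored-fields chunk size
DOC_BYTES = 2 * 1024      # ~1-4 KB per doc, per the numbers above
ROWS = 5000               # rows requested in the slow query

worst_case_read = ROWS * CHUNK_BYTES   # one full chunk per hit
useful_bytes = ROWS * DOC_BYTES        # what the response actually needs

amplification = worst_case_read / useful_bytes
print(f"read ~{worst_case_read / 2**20:.0f} MiB to return "
      f"~{useful_bytes / 2**20:.0f} MiB ({amplification:.0f}x amplification)")
```

Even the worst case here is tens of megabytes, nowhere near the hundreds of gigabytes observed, which is why the behavior puzzles me.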
Do you have any pointers for this issue?
--uyilmaz