Hi,

The current hardware profile for our production cluster is 20 nodes, each
with 24 cores and 256 GB of memory. The data being indexed is highly
structured: about 30 columns, roughly half of which are categorical with a
defined list of values. The expected peak indexing throughput is about
*50000* documents per second (to be run at off-peak hours, so search
requests will be minimal during this time), and the average throughput is
around *10000* documents per second during normal business hours.
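
For a rough sense of scale, the targets can be broken down per node and per
core. This is only a back-of-the-envelope sketch: it assumes documents are
spread evenly across all 20 nodes and that every core is available for
indexing, and it ignores replication, search load, and document size. The
function name is made up for illustration.

```python
# Cluster figures taken from the description above.
NODES = 20
CORES_PER_NODE = 24
PEAK_DOCS_PER_SEC = 50_000
AVERAGE_DOCS_PER_SEC = 10_000


def breakdown(cluster_docs_per_sec):
    """Return (docs/sec per node, docs/sec per core).

    Assumes a perfectly even spread across nodes and cores, which a real
    cluster will not achieve exactly.
    """
    per_node = cluster_docs_per_sec / NODES
    per_core = per_node / CORES_PER_NODE
    return per_node, per_core


for label, rate in (("peak", PEAK_DOCS_PER_SEC), ("average", AVERAGE_DOCS_PER_SEC)):
    per_node, per_core = breakdown(rate)
    print(f"{label}: {per_node:.0f} docs/s per node, {per_core:.1f} docs/s per core")
```

So the peak target works out to 2500 documents per second per node under
these assumptions, and the average to 500.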

Given the hardware profile, is it realistic and practical to achieve the
desired throughput? What factors affect indexing performance apart from
the hardware characteristics above? I understand that it's very difficult
to provide guidance without a prototype, but I would like to know what
considerations and dependencies we need to be aware of, and whether our
throughput expectations are realistic.

Thanks
