Hi,
We observed that when we use the setting compressed=true the index size is
around 0.66 times the size of the actual log file, whereas without the
compressed=true setting the index size is almost 2.6 times the log file size.
Our sample Solr document size is approximately 1000 bytes. In addition to the
text data we have around 9 metadata tags associated with it.
We need to display all of the metadata values on the GUI, and hence we are
setting stored=true in our schema.xml.
Now the question is: how does the compressed=true flag impact indexing and
querying operations? I expect there will be CPU utilization spikes, since the
stored data has to be compressed (during indexing) and uncompressed (during
querying). I am mainly looking for any benchmarks for the above scenario.
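Until real Solr numbers are available, a rough micro-benchmark can give a feel for the per-document CPU cost. The sketch below is only a stand-in: it assumes the stored-field compression behaves roughly like DEFLATE (here Python's zlib), and the sample log line and the function name time_compression are made up for illustration. It does not measure Solr itself.

```python
import time
import zlib


def time_compression(doc: bytes, rounds: int = 1000):
    """Return (compressed_size, compress_seconds, decompress_seconds).

    Assumption: zlib (DEFLATE) is used as a proxy for the stored-field
    compression; this does not exercise Solr or Lucene in any way.
    """
    if rounds < 1:
        raise ValueError("rounds must be at least 1")

    # Time repeated compression of the same document (indexing side).
    start = time.perf_counter()
    for _ in range(rounds):
        packed = zlib.compress(doc)
    compress_s = time.perf_counter() - start

    # Time repeated decompression (query side, when stored fields are read).
    start = time.perf_counter()
    for _ in range(rounds):
        unpacked = zlib.decompress(packed)
    decompress_s = time.perf_counter() - start

    # Sanity check: the round trip must give back the original bytes.
    assert unpacked == doc
    return len(packed), compress_s, decompress_s


# Hypothetical ~1000-byte document. It is highly repetitive, so real log
# data will most likely compress less well than this sample.
sample_doc = (
    b"2009-06-01 12:00:00 INFO host=app01 user=alice action=login status=ok " * 15
)[:1000]

if __name__ == "__main__":
    size, c_s, d_s = time_compression(sample_doc)
    print(f"original={len(sample_doc)} bytes, compressed={size} bytes")
    print(f"compress: {c_s:.4f}s, decompress: {d_s:.4f}s for 1000 rounds")
```

Running this against a sample of real documents instead of sample_doc would give a more meaningful estimate of the relative cost of compressing versus uncompressing.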
The expected volume of incoming data is approximately 400 GB per day, so it
is very important for us to evaluate compressed=true because of its effect
on file system utilization and index sizing.
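To put the sizing concern in numbers, the ratios observed above can be applied to the expected daily volume. This is a back-of-the-envelope sketch that assumes the 0.66 and 2.6 ratios measured on the sample hold at full volume; the constant and function names are illustrative only.

```python
# Figures taken from the observations above.
DAILY_INPUT_GB = 400          # expected incoming data per day
RATIO_COMPRESSED = 0.66       # index size / log size with compressed=true
RATIO_UNCOMPRESSED = 2.6      # index size / log size without compression


def daily_index_gb(input_gb: float, ratio: float) -> float:
    """Estimated index growth per day, assuming the ratio scales linearly."""
    return input_gb * ratio


compressed_gb = daily_index_gb(DAILY_INPUT_GB, RATIO_COMPRESSED)
uncompressed_gb = daily_index_gb(DAILY_INPUT_GB, RATIO_UNCOMPRESSED)
saved_gb = uncompressed_gb - compressed_gb

if __name__ == "__main__":
    print(f"with compressed=true:    {compressed_gb:.0f} GB/day")
    print(f"without compressed=true: {uncompressed_gb:.0f} GB/day")
    print(f"difference:              {saved_gb:.0f} GB/day")
```

Under that assumption the index would grow by roughly 264 GB per day with compression versus roughly 1040 GB per day without it, a difference of about 776 GB per day.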
Any help would be greatly appreciated.
Thanks,
sS