Hi all

I recently joined the Lucene team at Amazon, and this is my first time
working with Lucene, so any help would be appreciated.

One of my first tasks is to *add a metric in production to track the RAM /
disk usage of vector fields*. We want to use this metric to decide when to
scale our deployments.

One idea for getting this data was to split the index files so that each
field has its own files, with the field name as a filename prefix. We could
then scan the index files and sum the bytes used by each field. However,
the Lucene docs call this out as a bad practice (
https://github.com/apache/lucene/blob/main/dev-docs/file-formats.md#dont-use-too-many-files
)
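
For concreteness, the analysis step of that idea could look roughly like the
sketch below. It is only an illustration and not something Lucene does today:
the "<field>__<rest>" naming scheme, the SEPARATOR constant, and the
FieldDiskUsage class are all hypothetical, and it only measures bytes on disk,
not RAM.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

public class FieldDiskUsage {
    // Hypothetical separator between the field-name prefix and the rest of
    // the file name, e.g. "myvector__0.dat". Lucene does not name files this way.
    static final String SEPARATOR = "__";

    // Sums the on-disk size of every regular file in indexDir, grouped by
    // the field-name prefix. Files without the separator are skipped.
    static Map<String, Long> bytesPerField(Path indexDir) throws IOException {
        Map<String, Long> totals = new TreeMap<>();
        try (Stream<Path> files = Files.list(indexDir)) {
            for (Path file : (Iterable<Path>) files::iterator) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                String name = file.getFileName().toString();
                int cut = name.indexOf(SEPARATOR);
                if (cut <= 0) {
                    continue; // no field prefix on this file
                }
                totals.merge(name.substring(0, cut), Files.size(file), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: FieldDiskUsage <index-directory>");
            return;
        }
        // Print one "field<TAB>bytes" line per field prefix found.
        bytesPerField(Paths.get(args[0]))
            .forEach((field, bytes) -> System.out.println(field + "\t" + bytes));
    }
}
```

The per-field totals could then be published as the scaling metric, but this
only works if every field really does get its own files, which is exactly
what the doc above advises against.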

Is there any other way to find out how many bytes are being used by vector
fields?

Thanks!

Tanmay
