But does size on disk help? If the doc has a zillion images in it, those aren't part of the resulting index (I'm excluding stored data here)...
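
If the goal is to count the bytes handed to the index rather than the file's size on disk, one rough option is to sum the raw field values just before writing. The sketch below is only an illustration: `FieldSizeEstimator`, `estimateBytes`, and the plain `Map` standing in for a document's fields are hypothetical names, not part of Lucene. With a real Lucene `Document` you would presumably walk `getFields()` and read each field's `stringValue()` or `binaryValue()` instead. Either way the result measures input bytes, not what the index ends up occupying after analysis and compression.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: sums the raw byte size of field values before indexing.
// The Map is a stand-in for a document's fields, not a Lucene type.
final class FieldSizeEstimator {
    private FieldSizeEstimator() {}

    static long estimateBytes(Map<String, Object> fields) {
        long total = 0;
        for (Object value : fields.values()) {
            if (value instanceof CharSequence) {
                // Text is counted as its UTF-8 encoded length.
                total += value.toString().getBytes(StandardCharsets.UTF_8).length;
            } else if (value instanceof byte[]) {
                // Binary payloads (e.g. an embedded image) count as-is.
                total += ((byte[]) value).length;
            } else if (value instanceof Integer || value instanceof Float) {
                total += Integer.BYTES;
            } else if (value instanceof Long || value instanceof Double) {
                total += Long.BYTES;
            }
            // Any other value type is ignored in this sketch.
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("title", "hello");            // 5 bytes of UTF-8 text
        fields.put("thumbnail", new byte[3]);    // 3 bytes of binary data
        fields.put("timestamp", 1530700000000L); // 8-byte long
        System.out.println(estimateBytes(fields));
    }
}
```

Leaving the image bytes out of the map is the way to exclude them from the total, which is the distinction from "stream_size".
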
On Wed, Jul 4, 2018 at 7:49 AM, Terry Steichen <[email protected]> wrote:
> In the document types I usually index (.pdf, .docx/.doc, .eml), there
> exists a metadata field called "stream_size" that contains the size of
> the document on disk. You don't have to compute it. Thus, when you
> retrieve each document you can pull out the contents of this field and,
> if you like, include it in each hitlist entry.
>
>
> On 07/04/2018 05:26 AM, Chris and Helen Bamford wrote:
>> Hi there,
>>
>> How can I calculate the total size of a Lucene Document that I'm about
>> to write to an index so I know how many bytes I am writing please? I
>> need it for some external metrics collection.
>>
>> Thanks
>>
>> - Chris
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
