But does size on disk help? If the doc has a zillion images in it, those aren't part of the resulting index (I'm excluding stored data here)...
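
If the goal is a number for metrics that tracks what is actually handed to the indexer rather than the file on disk, one rough option is to sum the byte lengths of the field values themselves. Below is a minimal sketch in plain Java, assuming the field values are already available as a name-to-value map. The class DocSizeEstimate and its methods are hypothetical and not part of Lucene. With a real Lucene Document, the same idea would iterate doc.getFields() and read stringValue(), binaryValue() or numericValue() from each field. Note this only measures the input; it is not the number of bytes the index grows by, since postings are encoded and compressed.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not a Lucene class: estimates the size of the
// field values of one document before it is written to the index.
public class DocSizeEstimate {

    // Rough byte count for a single field value:
    // UTF-8 length for text, raw length for bytes, fixed width for numbers.
    static long valueBytes(Object value) {
        if (value instanceof CharSequence) {
            return value.toString().getBytes(StandardCharsets.UTF_8).length;
        }
        if (value instanceof byte[]) {
            return ((byte[]) value).length;
        }
        if (value instanceof Integer || value instanceof Float) {
            return 4;
        }
        if (value instanceof Long || value instanceof Double) {
            return 8;
        }
        return 0; // unknown value types are ignored in this sketch
    }

    // Sums the estimated bytes of all field values (field names are not counted).
    static long estimateBytes(Map<String, Object> fields) {
        long total = 0;
        for (Object value : fields.values()) {
            total += valueBytes(value);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed example fields; only extracted text and metadata, no images.
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("title", "Quarterly report");
        fields.put("body", "Extracted text only, no embedded images.");
        fields.put("stream_size", 5_242_880L);
        System.out.println(estimateBytes(fields) + " bytes of field values");
    }
}
```

For a PDF full of images, this estimate will be far smaller than stream_size, which is the point: the images never reach the index.
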

On Wed, Jul 4, 2018 at 7:49 AM, Terry Steichen <te...@net-frame.com> wrote:
> In the document types I usually index (.pdf, .docx/.doc, .eml), there
> exists a metadata field called "stream_size" that contains the size of
> the document on disk. You don't have to compute it. Thus, when you
> retrieve each document you can pull out the contents of this field and,
> if you like, include it in each hitlist entry.
>
>
> On 07/04/2018 05:26 AM, Chris and Helen Bamford wrote:
>> Hi there,
>>
>> How can I calculate the total size of a Lucene Document that I'm about
>> to write to an index so I know how many bytes I am writing please? I
>> need it for some external metrics collection.
>>
>> Thanks
>>
>> - Chris
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org