Duh... that was my issue. I am storing the content also. Sorry for the newbie question. I'll crawl back under my rock now.....
----- Original Message -----
From: "Peter Carlson" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Monday, May 20, 2002 8:50 PM
Subject: Re: sanity check - index size

> This seems big depending on what you are storing.
>
> For example, I have a set of data with 457MB and my Lucene index is 115MB.
> However, I don't store much.
>
> If you are storing the complete text (even if you don't index it), then it
> will be about the same size (no, probably bigger) than your original data
> set.
>
> --Peter
>
> On 5/20/02 4:16 PM, "Erik Hatcher" <[EMAIL PROTECTED]> wrote:
>
> > I'm indexing 900+ files (less than 1,000) that total about 15MB in size.
> > These are text files and HTML files. I only index them into a few fields
> > (title, content, filename). My index (specifically _sd.fdt) is 20MB. The
> > bulk of the HTML files are Javadoc files (Ant's own documentation,
> > actually).
> >
> > Does that seem at all close to being reasonable/normal? I am calling
> > optimize() before closing the index.
> >
> > Thanks for the sanity check.
> >
> > Erik
> >
> > --
> > To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
> > For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
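
For anyone who finds this thread later: Peter's point (stored text costs roughly as much as the source, while the searchable index can be much smaller) can be illustrated with a toy model. This is only a sketch and not Lucene's actual file format or API. The class name StoredVsIndexed, both method names, and the "4 bytes per posting" cost are assumptions made up for the illustration. The only Lucene fact relied on is that the .fdt file mentioned above holds stored field values.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: hypothetical names, not Lucene classes.
class StoredVsIndexed {

    /** Bytes needed to keep every document's text verbatim, as a stored field would. */
    static long storedBytes(List<String> docs) {
        long total = 0;
        for (String doc : docs) {
            total += doc.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    /**
     * Rough size of a toy inverted index: each distinct term is kept once,
     * plus an assumed 4 bytes for every (term, document) posting.
     */
    static long invertedIndexBytes(List<String> docs) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String token : docs.get(id).toLowerCase().split("\\W+")) {
                if (token.isEmpty()) {
                    continue; // split() can yield an empty leading token
                }
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
        long total = 0;
        for (Map.Entry<String, Set<Integer>> entry : postings.entrySet()) {
            total += entry.getKey().getBytes(StandardCharsets.UTF_8).length;
            total += 4L * entry.getValue().size();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("ant ant ant ant", "ant build ant build");
        System.out.println("stored field bytes:   " + storedBytes(docs));
        System.out.println("inverted index bytes: " + invertedIndexBytes(docs));
    }
}
```

Repeated words cost the toy index nothing extra within a document, but a stored copy pays for every byte. That is the same effect as in the thread: index-only fields stay well below the source size, and storing the full content adds roughly the whole source on top.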
