Hi,

Wide characters are stored in the index as UTF-8, so characters that fit in 7 bits (i.e. plain ASCII) take up exactly the same amount of space whether you build with char or wchar_t. I would stick to the wide character format unless you have a compelling reason to use ASCII.
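To illustrate the point, here is a rough sketch in Python rather than CLucene code. The helper names and the 4-byte wide character width are my own assumptions for the example. It shows that the in-memory width of a character does not decide its encoded size: ASCII text costs one byte per character in UTF-8 regardless of how TCHAR is defined.

```python
def utf8_size(text: str) -> int:
    """Bytes needed to store text as UTF-8 (roughly what is written to disk)."""
    return len(text.encode("utf-8"))


def wide_size(text: str, bytes_per_char: int = 4) -> int:
    """Bytes the same text occupies in memory as fixed-width wide characters.

    bytes_per_char is an assumption: wchar_t is commonly 4 bytes on Linux
    and 2 bytes on Windows. Surrogate pairs are ignored to keep this simple.
    """
    return len(text) * bytes_per_char


ascii_text = "clucene index"   # every character fits in 7 bits
accented_text = "café"         # 'é' needs two bytes in UTF-8

for sample in (ascii_text, accented_text):
    print(sample, "| wide in memory:", wide_size(sample),
          "| utf-8 encoded:", utf8_size(sample))
```

For the ASCII sample the encoded size equals the character count, which is why switching TCHAR from wchar_t to char leaves the index the same size.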
Reducing size: Luke (http://code.google.com/p/luke/) will help you figure out what you've actually stored in the index. Reducing the number of fields you 'STORE' helps a lot. There is a compressed field type in CLucene, but it was a bit hard to get going until the latest versions (now it's just another flag: Field::STORE_COMPRESS).

ben

On Wed, Jun 8, 2011 at 2:07 AM, Teryl Taylor <teryl.tay...@gmail.com> wrote:
> Hi everyone,
>
> I just had a quick question about search engine size. The search engine
> takes everything as wide characters. Since everything I'm putting in the
> database is ASCII, I thought I'd compile the search engine with ASCII Mode
> on. This took the TCHAR and defined it as a char rather than a wchar_t.
> When I recompiled everything, and ran it, the search engine database was the
> exact same size as the original wide char. Anyone know why that is? I
> would have thought using chars instead of wide chars would have reduced the
> size. Am I missing a configuration?
>
> Also, does anyone have any tips on reducing the size of a search engine?
> Lucene doesn't support a compression mechanism right? It's not that the
> database is bloated or anything, it's just any size reduction I can get is
> beneficial, so I'm just investigating ways to get it as small as possible.
>
> Thanks,
>
> Teryl

--
-------------------------------------
Ben van Klinken
Mob: 0401 921847
Em: b...@villagechief.com

_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers