Thanks for all the comments. For what it's worth, file size is not a concern for what I'm doing; index performance is paramount. The index will be static: no adding or deleting, it's read-only.
On Fri, Feb 26, 2010 at 2:49 PM, Nicholas Paldino [.NET/C# MVP] <[email protected]> wrote:

> Andrew,
>
> If you are going to unpack the index into a temp directory and then
> repack the file when you are done, then you are going to incur a cost
> on startup and on teardown of the process which is mainly I/O and CPU
> bound (I/O because you have to read the zip file from disk and then
> write the unpacked file from the zip to another location, and CPU
> bound because you are translating the byte stream while unpacking).
>
> That approach doesn't do anything but add that additional I/O and CPU
> overhead on startup. The "big win" for compressing the file is to save
> space on disk, or whatever medium the byte stream is being persisted
> to.
>
> If all you do is unzip the file in the beginning and zip it up at the
> end, then from your app's point of view, you do a lot of extra work
> for nothing. Unless you have real disk space issues, I'd recommend
> against this.
>
> Now, if you were to create a new Directory class which uses a
> GZipStream or DeflateStream as a façade over the FileStream which
> writes to disk, then you are reaping the benefits of compressing the
> file. The index will always be compressed on disk and you are
> realizing the gains.
>
> The cost of doing this, however, is more CPU time (to perform the
> translation) but with a gain of fewer I/O operations to disk (since
> there are fewer bytes being written to disk).
>
> Depending on how much activity you have on reading/writing to/from the
> index it might or might not make an impact. You have to measure that
> yourself given your application's use of the index.
>
> If file size is ^truly^ a concern, have you considered just setting
> the compression flag on the *folder* that contains the index files?
> Any files that are added/updated/deleted will automatically be
> compressed if the flag is set on the folder, so doing it in code is
> busywork when the OS automatically provides it for you (assuming you
> are on Windows, which is a safe bet given you are running .NET, but
> not absolute, of course).
>
> - Nick
>
> -----Original Message-----
> From: Andrew Schuler [mailto:[email protected]]
> Sent: Friday, February 26, 2010 4:48 PM
> To: [email protected]
> Subject: Re: Lucene index file container
>
> Thanks for both answers on this.
> I considered a zip file but was unsure of the associated overhead of
> unpacking the file. Does anyone have experience running an index
> directly out of a zip file?
> Are my worries unfounded? I was just trying to leverage the experience
> of the group, but otherwise I'll just have to run some tests on my
> own.
>
> On Fri, Feb 26, 2010 at 11:55 AM, Nicholas Petersen
> <[email protected]> wrote:
>
> > <Can anyone recommend a way to package the index into say some type
> > of file container>
> >
> > If I understand correctly, it sounds like you're asking for a
> > text-book implementation of an archiver, like a zip file. If so,
> > DotNetZip is a solid product, very easy to use, very fast. Highly
> > recommended.
> > http://www.codeplex.com/DotNetZip.
> >
> > Best,
> > Nick
> >
> > On Fri, Feb 26, 2010 at 2:47 PM, Andrew Schuler <[email protected]>
> > wrote:
> >
> > > Yes, that is do-able. I was just thinking it would be cleaner to
> > > wrap the indexes (there will be more than one) in some sort of
> > > file container. One of the things I'd like to do is allow the
> > > user to download pre-packaged indexes and load them into the app.
> > > This would be easier with a file than a directory of files, no?
> > >
> > > On Fri, Feb 26, 2010 at 11:41 AM, Hans Merkl <[email protected]>
> > > wrote:
> > >
> > > > Can't you add all the files in the index directory to the
> > > > installer package?
> > > > This should be pretty straightforward.
> > > >
> > > > -----Original Message-----
> > > > From: Andrew Schuler [mailto:[email protected]]
> > > > Sent: Friday, February 26, 2010 12:16 PM
> > > > To: [email protected]
> > > > Subject: Lucene index file container
> > > >
> > > > The discussion about encrypting an index has me thinking about a
> > > > current use I have for Lucene.net. I'm building a small app with
> > > > a static index distributed with it. Can anyone recommend a way
> > > > to package the index into say some type of file container for
> > > > inclusion in an installer package?
> > > >
> > > > -andy
