Andrew,
If that's the case, then you shouldn't be considering compressing the
index; it's just going to add overhead that you don't need.
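
If you want to see that overhead for yourself, here is a minimal sketch (in Python rather than C#, purely for illustration) that writes the same bytes plain and gzip-compressed and times reading each back. The function names and the `segment.bin` file names are made up for the example, and the timings depend entirely on your data and hardware, so treat it as a starting point for your own measurements, not as a benchmark.

```python
import gzip
import os
import tempfile
import time


def timed_read(opener, path, chunk_size=64 * 1024):
    """Read the whole file through `opener`; return (bytes_read, seconds)."""
    start = time.perf_counter()
    total = 0
    with opener(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def compare_plain_and_gzip(payload):
    """Write `payload` plain and gzip-compressed, then time reading each back."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file names; a real index has many files.
        plain_path = os.path.join(tmp, "segment.bin")
        packed_path = os.path.join(tmp, "segment.bin.gz")
        with open(plain_path, "wb") as f:
            f.write(payload)
        with gzip.open(packed_path, "wb") as f:
            f.write(payload)
        plain_bytes, plain_seconds = timed_read(open, plain_path)
        gzip_bytes, gzip_seconds = timed_read(gzip.open, packed_path)
    return {
        "plain_bytes": plain_bytes,
        "plain_seconds": plain_seconds,
        "gzip_bytes": gzip_bytes,
        "gzip_seconds": gzip_seconds,
    }


if __name__ == "__main__":
    # Stand-in data; substitute the bytes of a real index file.
    result = compare_plain_and_gzip(b"lucene term " * 500_000)
    print(result)
```

Both reads return the same number of bytes; the only difference is the extra CPU work spent decompressing on the gzip path.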
- Nick
-----Original Message-----
From: Andrew Schuler [mailto:[email protected]]
Sent: Friday, February 26, 2010 6:18 PM
To: [email protected]
Subject: Re: Lucene index file container
Thanks for all the comments.
For what it's worth, for what I'm doing, file size is not a concern. Index
performance is paramount. The index will be static, with no adding or
deleting; it's read-only.
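
For a static, read-only index that is shipped as a single file, one option is to unpack the package once (at install or first run) and then always open the plain directory, so nothing is decompressed at query time. A rough sketch of that idea follows, in Python rather than C# for brevity. The function name, the `.unpacked` marker file, and the demo file names are all assumptions made for the example, not anything Lucene.Net provides.

```python
import os
import tempfile
import zipfile


def install_index(package_path, target_dir):
    """Extract a packaged index once; return the directory to open read-only.

    A marker file (hypothetical convention) records that extraction finished,
    so later startups skip the unpacking cost entirely.
    """
    marker = os.path.join(target_dir, ".unpacked")
    if os.path.exists(marker):
        return target_dir
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(package_path) as archive:
        archive.extractall(target_dir)
    # Write the marker last so a failed extraction is retried next time.
    with open(marker, "w") as f:
        f.write("ok")
    return target_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a tiny stand-in package; a real one would hold the index files.
        package = os.path.join(tmp, "index-package.zip")
        with zipfile.ZipFile(package, "w") as archive:
            archive.writestr("segments.demo", b"placeholder index data")
        index_dir = install_index(package, os.path.join(tmp, "index"))
        print(sorted(os.listdir(index_dir)))
```

The second and later calls return immediately, so the zip only costs anything on the first load of a downloaded package.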
On Fri, Feb 26, 2010 at 2:49 PM, Nicholas Paldino [.NET/C# MVP] <
[email protected]> wrote:
> Andrew,
>
> If you are going to unpack the index into a temp directory and then
> repack the file when you are done, then you are going to incur a cost
> on startup and on teardown of the process which is mainly I/O and CPU
> bound (I/O because you have to read the zip file from disk and then write
> the unpacked file from the zip to another location, and CPU bound because
> you are translating the byte stream while unpacking).
>
> That approach doesn't do anything but add that additional I/O and
> CPU overhead on startup. The "big win" for compressing the file is to
> save space on disk, or whatever medium the byte stream is being persisted
> to.
>
> If all you do is unzip the file in the beginning and zip it up at
> the end, then from your app's point of view, you do a lot of extra work
> for nothing. Unless you have real disk space issues, I'd recommend
> against this.
>
> Now, if you were to create a new Directory class which uses a
> GZipStream or DeflateStream as a façade over the FileStream which writes
> to disk, then you are reaping the benefits of compressing the file. The
> index will always be compressed on disk and you are realizing the gains.
>
> The cost of doing this, however, is more CPU time (to perform the
> translation) but with a gain of fewer I/O operations to disk (since
> there are fewer bytes being written to disk).
>
> Depending on how much activity you have reading from and writing to
> the index, it might or might not make an impact. You have to measure that
> yourself given your application's use of the index.
>
> If file size is ^truly^ a concern, have you considered just setting
> the compression flag on the *folder* that contains the index files? Any
> files that are added/updated/deleted will automatically be compressed if
> the flag is set on the folder, so doing it in code is busywork when the
> OS automatically provides it for you (assuming you are on Windows, which
> is a safe bet given you are running .NET, but not absolute, of course).
>
> - Nick
>
> -----Original Message-----
> From: Andrew Schuler [mailto:[email protected]]
> Sent: Friday, February 26, 2010 4:48 PM
> To: [email protected]
> Subject: Re: Lucene index file container
>
> Thanks for both answers on this.
> I considered a zip file but was unsure of the associated overhead of
> unpacking the file. Does anyone have experience running an index directly
> out of a zip file?
> Are my worries unfounded? I was just trying to leverage the experience of
> the group, but otherwise I'll just have to run some tests on my own.
>
>
>
> On Fri, Feb 26, 2010 at 11:55 AM, Nicholas Petersen
> <[email protected]> wrote:
>
> > <Can anyone recommend a way to package the index into say some type of
> > file container>
> >
> > If I understand correctly, it sounds like you're asking for a textbook
> > implementation of an archiver, like a zip file. If so, DotNetZip is a
> > solid product, very easy to use, very fast. Highly recommended.
> > http://www.codeplex.com/DotNetZip.
> >
> > Best,
> > Nick
> >
> >
> >
> > On Fri, Feb 26, 2010 at 2:47 PM, Andrew Schuler <[email protected]>
> > wrote:
> >
> > > Yes, that is doable. I was just thinking it would be cleaner to wrap
> > > the indexes (there will be more than one) in some sort of file
> > > container. One of the things I'd like to do is allow the user to
> > > download pre-packaged indexes and load them into the app. This would
> > > be easier with a single file than with a directory of files, no?
> > >
> > >
> > > On Fri, Feb 26, 2010 at 11:41 AM, Hans Merkl <[email protected]> wrote:
> > >
> > > > Can't you add all the files in the index directory to the installer
> > > > package?
> > > > This should be pretty straightforward.
> > > >
> > > > -----Original Message-----
> > > > From: Andrew Schuler [mailto:[email protected]]
> > > > Sent: Friday, February 26, 2010 12:16 PM
> > > > To: [email protected]
> > > > Subject: Lucene index file container
> > > >
> > > > The discussion about encrypting an index has me thinking about a
> > > > current use I have for Lucene.net. I'm building a small app with a
> > > > static index distributed with it. Can anyone recommend a way to
> > > > package the index into, say, some type of file container for
> > > > inclusion in an installer package?
> > > >
> > > > -andy
> > > >
> > > >
> > > >
> > >
> >
>
