Hi All,

I started this thread to discuss Lucene scalability. I have an index that is 
80 GB in size. However, it looks like many of the segment files are either 
redundant or unused. Even if I delete them and retain only the CFS, segments 
and deletable files, the index seems to work fine. I would like to know a 
cleaner approach to identifying such redundant/unused files through the API. I 
can see these unused files in Luke, marked as "Deletable", but I am not sure 
how Luke identifies them. I am using Lucene.NET 2.0.

Can you please suggest a way to do this?
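
To make the manual check described above repeatable, here is a minimal sketch in Java (standard library only, no Lucene calls). It only mirrors the rule of thumb from this message: keep "segments", "deletable" and any *.cfs file, and list everything else as a candidate for inspection. The class name IndexFileAudit and both method names are hypothetical, and this is not how Lucene or Luke decide which files are unused. Other live files may exist outside the compound file (for example per-segment deletion files), so treat the output as a list to review, not a list to delete.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: applies the name-based heuristic from the message above.
public class IndexFileAudit {

    // Assumption: these are the only files worth keeping, as described in the post.
    static boolean isKept(String name) {
        return name.equals("segments")
            || name.equals("deletable")
            || name.endsWith(".cfs");
    }

    // Returns the sorted names of regular files that the heuristic does not keep.
    // Nothing is deleted; the caller decides what to do with the list.
    static List<String> candidates(Path indexDir) throws IOException {
        List<String> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexDir)) {
            for (Path p : stream) {
                String name = p.getFileName().toString();
                if (Files.isRegularFile(p) && !isKept(name)) {
                    result.add(name);
                }
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java IndexFileAudit <path-to-index-directory>
        for (String name : candidates(Paths.get(args[0]))) {
            System.out.println(name);
        }
    }
}
```

Run it against a copy of the index directory first, and compare the output with what Luke reports as "Deletable" before removing anything.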



-----Original Message-----
From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com]
Sent: Tuesday, January 13, 2009 1:01 AM
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene Scalability Options


Floyd, you will need to provide more details about the specific problems you 
are encountering.

I made a quick check, and have no difficulty opening and inspecting an index I 
created a few minutes ago with Lucene.NET v2.3.1 using Luke v0.9.1.

-- Neal


-----Original Message-----
From: Floyd Wu [mailto:floyd...@gmail.com]
Sent: Friday, January 09, 2009 8:18 PM
To: lucene-net-user@incubator.apache.org
Subject: Re: Lucene Scalability Options

Hi all,
It seems the new version of Luke is not compatible with Lucene.Net, so I
emailed the creator of Luke. Below is his feedback:

"Yes, there have been many changes, but Lucene 2.4 can still open indexes
built with earlier versions of Lucene/Java.
This is the second report I've got about a possible incompatibility with
Lucene.Net. I suggest raising this issue on the Lucene mailing list
(java-...@lucene.apache.org) and providing more details, e.g. the Lucene.Net
revision, a stack trace, and a small sample index if you can."

My original report is below:
"The situation is that Luke-0.9 cannot open index files built by
Lucene.Net-2.3.1.
I tried older versions of Luke and confirmed that Luke-0.8 and Luke-0.8.1 can
open and read the index files fine.
I wonder whether anything changed between Java Lucene 2.3 and 2.4.
Please help with this."

Floyd



2009/1/9 George Aroush <geo...@aroush.net>

> Hi Nitin,
>
> Any optimization that Luke can do on an index is also doable by making API
> calls from Lucene.Net.  If not, then there is either a bug in Lucene.Net or
> in your use of the API.  Can you share with us your API calls as well as
> the Lucene.Net version you are using?
>
> Thanks.
>
> -- George
>
> > -----Original Message-----
> > From: Nitin Shiralkar [mailto:nit...@coreobjects.com]
> > Sent: Friday, January 09, 2009 6:27 AM
> > To: lucene-net-user@incubator.apache.org
> > Subject: RE: Lucene Scalability Options
> >
> > Thanks Hugh. Yes, I tried using Luke for index optimization.
> > Surprisingly, it brought the index size down to ~20 GB,
> > with only one CFS and the segment files left behind. I used
> > the compound optimization option. I set the similar
> > "SetUseCompoundFile" property on the "IndexModifier" object in my
> > Lucene.NET code, but it has no effect on the size or the files after
> > optimization. Any suggestions?
> >
> >
> > -----Original Message-----
> > From: Hugh Spiller [mailto:hugh.spil...@renishaw.com]
> > Sent: Friday, January 09, 2009 3:35 PM
> > To: lucene-net-user@incubator.apache.org
> > Subject: RE: Lucene Scalability Options
> >
> > Hi Nitin,
> >
> > I've found the easiest way to get rid of redundant files in
> > an index is to use Luke. As soon as you use it to open the
> > index, it tidies up all the cruft.
> >
> > It's at http://www.getopt.org/luke/ .
> >
> > ________________________________
> >
> > Hugh Spiller
> >
> >
> > -----Original Message-----
> > From: Nitin Shiralkar [mailto:nit...@coreobjects.com]
> > Sent: 09 January 2009 08:48
> > To: lucene-net-user@incubator.apache.org
> > Subject: RE: Lucene Scalability Options
> >
> > -- snip --
> >
> >
> > Any inputs on junk/redundant files in above list?
> >
> >
> >
> >
>
>
