Re: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-18 Thread Stephane James Vaucher
Thanks, I was going to submit a patch for this feature this week-end.

sv

On Thu, 18 Mar 2004, Otis Gospodnetic wrote:

> I added support for all items listed below, except commit/write lock
> file name.  I don't see why one would want to change that, considering
> those files are still limited to the index directory.
> 
> Otis
> 
> --- Stephane James Vaucher <[EMAIL PROTECTED]> wrote:
> > How about (looking big rather than small):
> > 
> > - MaxClause from BooleanQuery (I know there has been discussions on 
> > the dev list, but I haven't been following it)
> > - default commit_lock_name
> > - default commit_lock_timeout
> > - default maxFieldLength
> > - default maxMergeDocs
> > - default mergeFactor
> > - default minMergeDocs
> > - default write_lock_name
> > - default write_lock_timeout
> > 
> > I'm currently configuring parts of my app using sys properties,
> > particularly the mergeFactor because my prod system has 2GB of RAM
> > and is windows based and my dev machine has 256MB and is linux. If
> > no one takes a crack at this, I'll see what I can do in 2 weeks,
> > after my vacations.
> > 
> > Cheers,
> > sv
> > 
> > On Wed, 3 Mar 2004, Doug Cutting wrote:
> > 
> > > Stephane James Vaucher wrote:
> > > > > As I've stated in my earlier mail, I like this change. More
> > > > > importantly, could this become a "standard" way of changing
> > > > > configurations at runtime? For example, the default merge factor
> > > > > could also be set in this manner.
> > > > 
> > > > Sure, that's reasonable, so this would be something like:
> > > > 
> > > > private static final int DEFAULT_MERGE_FACTOR =
> > > >   Integer.parseInt(System.getProperty("org.apache.lucene.mergeFactor","10"));
> > > 
> > > In IndexWriter.java.
> > > 
> > > What other candidates are there for this treatment?
> > > 
> > > Doug
> > > 
> > >
> > -
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail:
> > [EMAIL PROTECTED]
> > > 
> > 
> > 
> > -
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> > 
> 
> 
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-18 Thread Otis Gospodnetic
I added support for all items listed below, except commit/write lock
file name.  I don't see why one would want to change that, considering
those files are still limited to the index directory.

Otis

--- Stephane James Vaucher <[EMAIL PROTECTED]> wrote:
> How about (looking big rather than small):
> 
> - MaxClause from BooleanQuery (I know there has been discussions on 
> the dev list, but I haven't been following it)
> - default commit_lock_name
> - default commit_lock_timeout
> - default maxFieldLength
> - default maxMergeDocs
> - default mergeFactor
> - default minMergeDocs
> - default write_lock_name
> - default write_lock_timeout
> 
> I'm currently configuring parts of my app using sys properties,
> particularly the mergeFactor because my prod system has 2GB of RAM
> and is windows based and my dev machine has 256MB and is linux. If
> no one takes a crack at this, I'll see what I can do in 2 weeks,
> after my vacations.
> 
> Cheers,
> sv
> 
> On Wed, 3 Mar 2004, Doug Cutting wrote:
> 
> > Stephane James Vaucher wrote:
> > > As I've stated in my earlier mail, I like this change. More
> > > importantly, could this become a "standard" way of changing
> > > configurations at runtime? For example, the default merge factor
> > > could also be set in this manner.
> > 
> > Sure, that's reasonable, so this would be something like:
> > 
> > private static final int DEFAULT_MERGE_FACTOR =
> >   Integer.parseInt(System.getProperty("org.apache.lucene.mergeFactor","10"));
> > 
> > In IndexWriter.java.
> > 
> > What other candidates are there for this treatment?
> > 
> > Doug



RE: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-09 Thread hui
(ms) 16
39772 total within(ms) 32
18645 total within(ms) 32
995 total within(ms) 47
6 open index time:47
28367 total within(ms) 15
37970 total within(ms) 31
45169 total within(ms) 46
21168 total within(ms) 31
1112 total within(ms) 31
7 open index time:31
31424 total within(ms) 31
42002 total within(ms) 16
49994 total within(ms) 31
23432 total within(ms) 32
1223 total within(ms) 47
8 open index time:46
33895 total within(ms) 32
45292 total within(ms) 47
53957 total within(ms) 47
25230 total within(ms) 32
1352 total within(ms) 47
9 open index time:63
37320 total within(ms) 31
49922 total within(ms) 15
59412 total within(ms) 47
27830 total within(ms) 31
1474 total within(ms) 62
10 open index time:984
38475 total within(ms) 16
51552 total within(ms) 16
61389 total within(ms) 47
28638 total within(ms) 46
1530 total within(ms) 157

-Original Message-
From: Doug Cutting [mailto:[EMAIL PROTECTED] 
Sent: Monday, March 08, 2004 1:16 PM
To: Lucene Users List
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir  once again

hui wrote:
> Index time: 
> compound format is 89 seconds slower.
> 
> compound format:
> 1389507 total milliseconds
> non-compound format:
> 1300534 total milliseconds
> 
> The index size is 85m with 4 fields only. The files are stored in the
> index.
> The compound format has only 3 files and the other has 13 files.

Thanks for performing this benchmark!

It looks like the compound format is around 7% slower when indexing.  To 
my thinking that's acceptable, given the dramatic reduction in file 
handles.  If folks really need maximal indexing performance, then they 
can explicitly disable the compound format.

Would anyone object to making compound format the default for Lucene 
1.4?  This is an incompatible change, but I don't think it should break 
applications.

Doug




Re: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-08 Thread Terry Steichen
I tend to agree (but with the same uncertainty as to why I feel that way).

Regards,

Terry
- Original Message - 
From: "Otis Gospodnetic" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Monday, March 08, 2004 2:34 PM
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir  once again


> I can't explain why, but I feel like the old index format should stay
> by default.  I feel like I'd rather a (slightly) faster index, and
> switch to the compound one when/IF I encounter problems, than have a
> safer, but slower index, and never realize that there is a faster
> option available.
> 
> Weak argument, I know, but some instinct in me thinks that the current
> mode should remain.
> 
> Otis
> 
> 
> --- Doug Cutting <[EMAIL PROTECTED]> wrote:
> > hui wrote:
> > > Index time: 
> > > compound format is 89 seconds slower.
> > > 
> > > compound format:
> > > 1389507 total milliseconds
> > > non-compound format:
> > > 1300534 total milliseconds
> > > 
> > > The index size is 85m with 4 fields only. The files are stored in
> > > the index.
> > > The compound format has only 3 files and the other has 13 files.
> > 
> > Thanks for performing this benchmark!
> > 
> > It looks like the compound format is around 7% slower when indexing.
> > To my thinking that's acceptable, given the dramatic reduction in file
> > handles.  If folks really need maximal indexing performance, then
> > they can explicitly disable the compound format.
> > 
> > Would anyone object to making compound format the default for Lucene
> > 1.4?  This is an incompatible change, but I don't think it should
> > break applications.
> > 
> > Doug



Re: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-08 Thread Otis Gospodnetic
I can't explain why, but I feel like the old index format should stay
by default.  I feel like I'd rather a (slightly) faster index, and
switch to the compound one when/IF I encounter problems, than have a
safer, but slower index, and never realize that there is a faster
option available.

Weak argument, I know, but some instinct in me thinks that the current
mode should remain.

Otis


--- Doug Cutting <[EMAIL PROTECTED]> wrote:
> hui wrote:
> > Index time: 
> > compound format is 89 seconds slower.
> > 
> > compound format:
> > 1389507 total milliseconds
> > non-compound format:
> > 1300534 total milliseconds
> > 
> > The index size is 85m with 4 fields only. The files are stored in
> > the index.
> > The compound format has only 3 files and the other has 13 files.
> 
> Thanks for performing this benchmark!
> 
> It looks like the compound format is around 7% slower when indexing.
> To my thinking that's acceptable, given the dramatic reduction in file
> handles.  If folks really need maximal indexing performance, then
> they can explicitly disable the compound format.
> 
> Would anyone object to making compound format the default for Lucene
> 1.4?  This is an incompatible change, but I don't think it should
> break applications.
> 
> Doug



Re: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-08 Thread Doug Cutting
hui wrote:
> Index time:
> compound format is 89 seconds slower.
> 
> compound format:
> 1389507 total milliseconds
> non-compound format:
> 1300534 total milliseconds
> 
> The index size is 85m with 4 fields only. The files are stored in the index.
> The compound format has only 3 files and the other has 13 files.

Thanks for performing this benchmark!

It looks like the compound format is around 7% slower when indexing.  To 
my thinking that's acceptable, given the dramatic reduction in file 
handles.  If folks really need maximal indexing performance, then they 
can explicitly disable the compound format.

Would anyone object to making compound format the default for Lucene 
1.4?  This is an incompatible change, but I don't think it should break 
applications.

Doug
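
For reference, the figures quoted in this message can be checked with a few lines of arithmetic. This is only a sketch: the class and method names (CompoundOverhead, percentSlower) are made up for illustration and are not part of Lucene. The two timings are the ones hui reported.

```java
public final class CompoundOverhead {
    private CompoundOverhead() {}

    /** Relative slowdown of the compound run versus the non-compound run, in percent. */
    public static double percentSlower(long compoundMs, long nonCompoundMs) {
        return 100.0 * (compoundMs - nonCompoundMs) / nonCompoundMs;
    }

    public static void main(String[] args) {
        long compoundMs = 1389507L;     // total indexing time, compound format
        long nonCompoundMs = 1300534L;  // total indexing time, non-compound format

        // Absolute difference in milliseconds (the "89 seconds" in the report).
        System.out.println("difference (ms): " + (compoundMs - nonCompoundMs));
        // Relative difference, printed as a plain double.
        System.out.println("percent slower: " + percentSlower(compoundMs, nonCompoundMs));
    }
}
```

The difference is 88973 ms, and the relative slowdown comes out a little under 7%, which matches the rounded figure in the message.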



RE: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-08 Thread hui
Thank you, the conversion option in Luke is really helpful for migrating
existing user indexes.
Regards,
Hui

-Original Message-
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] 
Sent: Monday, March 08, 2004 10:57 AM
To: Lucene Users List
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir  once again

hui wrote:

> 
> 
> 
> Hi,
> 
> Here is the indexing performance testing result for the two index formats.

A shameless plug: you can use Luke (http://www.getopt.org/luke) to 
convert the same index between compound/non-compound formats. Which 
could be useful to rule out any possible differences in the 
indexing/inserting process between the runs. Luke provides you also with 
a simple time measurement for query execution. Just FYI.

-- 
Best regards,
Andrzej Bialecki

-
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-
FreeBSD developer (http://www.freebsd.org)





Re: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-08 Thread Andrzej Bialecki
hui wrote:
> Hi,
> 
> Here is the indexing performance testing result for the two index formats.

A shameless plug: you can use Luke (http://www.getopt.org/luke) to 
convert the same index between compound/non-compound formats. Which 
could be useful to rule out any possible differences in the 
indexing/inserting process between the runs. Luke provides you also with 
a simple time measurement for query execution. Just FYI.

--
Best regards,
Andrzej Bialecki
-
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-
FreeBSD developer (http://www.freebsd.org)


RE: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-08 Thread hui




Hi,

Here is the indexing performance testing result for the two index formats.


1000 megahertz Intel Pentium III (2 installed)
32 kilobyte primary memory cache
256 kilobyte secondary memory cache

SCSI Hard drive 145.45 GB  
RAM 3G

Windows 2000 Advanced Server, Service Pack 2

JDK 140
JVM memory 512m

Indexed files: 66100 local text files, around 400m

Index time: 
compound format is 89 seconds slower.

compound format:
1389507 total milliseconds
non-compound format:
1300534 total milliseconds

The index size is 85m with 4 fields only. The files are stored in the index.
The compound format has only 3 files and the other has 13 files. 

Search time (only the top 10 hits retrieved, no indexing at the same time, a
single search thread, indices optimized and already open):
I do not see a consistent difference in this simple case.

compound format:
Query: iraq
4275 total within(ms) 110
Query: war
5728 total within(ms) 0
Query: iraq AND war
3182 total within(ms) 16

non-compound format:
Query: war
5728 total within(ms) 125
Query: iraq war
6821 total within(ms) 31
Query: iraq AND war
3182 total within(ms) 0



-Original Message-
From: Doug Cutting [mailto:[EMAIL PROTECTED] 
Sent: Thursday, March 04, 2004 11:54 AM
To: Lucene Users List
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir  once again

hui wrote:
> Not yet. For the compound file format, when the files get bigger, if I add
> few new files frequently, the bigger files has to be updated. Will that
> affect lot on the search and produce heavier disk I/O compared with the
> traditional index format? It seems OS cache makes quite difference when
> the files not changed differently.

The compound format slows indexing performance slightly, but should not 
affect search performance much.  It radically reduces the number of file 
handles used when searching, by a factor of eight or more, depending on 
how many indexed fields you have.

Perhaps the compound format should be the default format in 1.4.  Can 
folks provide any benchmarks for how it affects performance?

Doug




Re: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-04 Thread Doug Cutting
hui wrote:
> Not yet. For the compound file format, when the files get bigger, if I add
> few new files frequently, the bigger files has to be updated. Will that
> affect lot on the search and produce heavier disk I/O compared with the
> traditional index format? It seems OS cache makes quite difference when the
> files not changed differently.

The compound format slows indexing performance slightly, but should not 
affect search performance much.  It radically reduces the number of file 
handles used when searching, by a factor of eight or more, depending on 
how many indexed fields you have.

Perhaps the compound format should be the default format in 1.4.  Can 
folks provide any benchmarks for how it affects performance?

Doug



RE: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-04 Thread hui
Not yet. For the compound file format, when the files get bigger, if I add
a few new files frequently, the bigger files have to be updated. Will that
affect search much and produce heavier disk I/O compared with the
traditional index format? It seems the OS cache makes quite a difference when
the files are not changed.

Regards,
Hui

-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, March 03, 2004 9:21 PM
To: Lucene Users List
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir  once again


On Mar 3, 2004, at 4:25 PM, hui wrote:
> Anoterh similar issue. If we could have a parameter to control the max
> number of the files within the index, that is going to avoid the 
> problem of
> running of the file handler issue.
> When the file number within one index reaches the limit, optimization 
> is
> going to be called.
> Right now, if the file number within one index out of the limit of your
> window system, you lost the index.
> Thank you for the consideration.

Have you tried using the compound file format introduced in 1.3?






Re: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-03 Thread Erik Hatcher
On Mar 3, 2004, at 4:25 PM, hui wrote:
> Anoterh similar issue. If we could have a parameter to control the max
> number of the files within the index, that is going to avoid the
> problem of running of the file handler issue.
> When the file number within one index reaches the limit, optimization
> is going to be called.
> Right now, if the file number within one index out of the limit of your
> window system, you lost the index.
> Thank you for the consideration.

Have you tried using the compound file format introduced in 1.3?





RE: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-03 Thread hui
Another similar issue. If we could have a parameter to control the maximum
number of files within the index, that would avoid the problem of
running out of file handles.
When the number of files within one index reaches the limit, optimization
would be triggered.
Right now, if the number of files within one index exceeds the limit of your
Windows system, you lose the index.
Thank you for the consideration.

Regards,
hui

-Original Message-
From: Doug Cutting [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, March 03, 2004 3:46 PM
To: Lucene Users List
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir  once again

Stephane James Vaucher wrote:
> As I've stated in my earlier mail, I like this change. More importantly, 
> could this become a "standard" way of changing configurations at runtime? 
> For example, the default merge factor could also be set in this manner.

Sure, that's reasonable, so this would be something like:

private static final int DEFAULT_MERGE_FACTOR =
  Integer.parseInt(System.getProperty("org.apache.lucene.mergeFactor","10"));

In IndexWriter.java.

What other candidates are there for this treatment?

Doug




Re: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-03 Thread Stephane James Vaucher
How about (looking big rather than small):

- MaxClause from BooleanQuery (I know there has been discussions on 
the dev list, but I haven't been following it)
- default commit_lock_name
- default commit_lock_timeout
- default maxFieldLength
- default maxMergeDocs
- default mergeFactor
- default minMergeDocs
- default write_lock_name
- default write_lock_timeout

I'm currently configuring parts of my app using sys properties, 
particularly the mergeFactor because my prod system has 2GB of RAM and is 
windows based and my dev machine has 256MB and is linux. If no one takes a 
crack at this, I'll see what I can do in 2 weeks, after my vacations.

Cheers,
sv

On Wed, 3 Mar 2004, Doug Cutting wrote:

> Stephane James Vaucher wrote:
> > As I've stated in my earlier mail, I like this change. More importantly, 
> > could this become a "standard" way of changing configurations at runtime? 
> > For example, the default merge factor could also be set in this manner.
> 
> Sure, that's reasonable, so this would be something like:
> 
> private static final int DEFAULT_MERGE_FACTOR =
>   Integer.parseInt(System.getProperty("org.apache.lucene.mergeFactor","10"));
> 
> In IndexWriter.java.
> 
> What other candidates are there for this treatment?
> 
> Doug



Re: Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-03 Thread Doug Cutting
Stephane James Vaucher wrote:
> As I've stated in my earlier mail, I like this change. More importantly,
> could this become a "standard" way of changing configurations at runtime?
> For example, the default merge factor could also be set in this manner.

Sure, that's reasonable, so this would be something like:

private static final int DEFAULT_MERGE_FACTOR =
  Integer.parseInt(System.getProperty("org.apache.lucene.mergeFactor","10"));

In IndexWriter.java.

What other candidates are there for this treatment?

Doug
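
A slightly more defensive variant of the one-liner in this message, as a sketch only. With a plain Integer.parseInt call in a static initializer, a malformed property value would throw NumberFormatException while the class is being loaded. The helper below falls back to the default instead. The class name LuceneDefaults and the method intProperty are hypothetical and not part of Lucene; only the property name and the default of 10 come from the message.

```java
public final class LuceneDefaults {
    private LuceneDefaults() {}

    /**
     * Reads an int-valued system property.
     * Returns defaultValue when the property is unset or is not a valid integer.
     */
    public static int intProperty(String name, int defaultValue) {
        String raw = System.getProperty(name);
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            // Malformed value: keep the built-in default rather than failing.
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        // Hypothetical usage; the property name is the one proposed in the thread.
        int mergeFactor = intProperty("org.apache.lucene.mergeFactor", 10);
        System.out.println("mergeFactor=" + mergeFactor);
    }
}
```

Whether a bad value should be silently ignored or reported is a design choice; failing loudly may be preferable in some deployments.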



Sys properties Was: java.io.tmpdir as lock dir .... once again

2004-03-03 Thread Stephane James Vaucher
As I've stated in my earlier mail, I like this change. More importantly, 
could this become a "standard" way of changing configurations at runtime? 
For example, the default merge factor could also be set in this manner.

sv

On Wed, 3 Mar 2004, Michael Duval wrote:

> 
> I agree with both the property name change and also making it static.
> 
> Mike
> 
> Doug Cutting wrote:
> 
> > Michael Duval wrote:
> > > I've hacked the code for the time being by updating FSDirectory and
> > > replaced all System.getProperty("java.io.tmpdir")
> > > calls with a call to a new method "getLockDir()".   This method
> > > checks for a "lucene.lockdir" prop before the
> > > "java.io.tmpdir" prop giving the end user a bit more flexibility in
> > > where locks are stored.
> >
> > In general, I support this change.
> >
> > > Here is the method:
> > >
> > >   /** Allow flexible locking directories - Michael R. Duval 3/02/04 */
> > >   private String getLockDir() {
> > >     String lockDir;
> > >
> > >     if ((lockDir = System.getProperty("lucene.lockdir")) == null)
> > >       return System.getProperty("java.io.tmpdir");
> > >     else
> > >       return lockDir;
> > >   }
> >
> > In particular, I have some quibbles.  The property should be named
> > something like "org.apache.lucene.lockdir", not just "lucene.lockdir".
> > And there's no reason to look it up each time: it can just be a static.
> >
> > private static final String LOCK_DIR =
> >   System.getProperty("org.apache.lucene.lockdir",
> >                      System.getProperty("java.io.tmpdir"));
> >
> > Doug



Re: java.io.tmpdir as lock dir .... once again

2004-03-03 Thread Michael Duval
I agree with both the property name change and also making it static.

Mike

Doug Cutting wrote:

> Michael Duval wrote:
> > I've hacked the code for the time being by updating FSDirectory and
> > replaced all System.getProperty("java.io.tmpdir")
> > calls with a call to a new method "getLockDir()".   This method
> > checks for a "lucene.lockdir" prop before the
> > "java.io.tmpdir" prop giving the end user a bit more flexibility in
> > where locks are stored.
>
> In general, I support this change.
>
> > Here is the method:
> >
> >   /** Allow flexible locking directories - Michael R. Duval 3/02/04 */
> >   private String getLockDir() {
> >     String lockDir;
> >     if ((lockDir = System.getProperty("lucene.lockdir")) == null)
> >       return System.getProperty("java.io.tmpdir");
> >     else
> >       return lockDir;
> >   }
>
> In particular, I have some quibbles.  The property should be named
> something like "org.apache.lucene.lockdir", not just "lucene.lockdir".
> And there's no reason to look it up each time: it can just be a static.
>
> private static final String LOCK_DIR =
>   System.getProperty("org.apache.lucene.lockdir",
>                      System.getProperty("java.io.tmpdir"));
>
> Doug



Re: java.io.tmpdir as lock dir .... once again

2004-03-03 Thread Doug Cutting
Michael Duval wrote:
> I've hacked the code for the time being by updating FSDirectory and
> replaced all System.getProperty("java.io.tmpdir")
> calls with a call to a new method "getLockDir()".   This method checks
> for a "lucene.lockdir" prop before the
> "java.io.tmpdir" prop giving the end user a bit more flexibility in
> where locks are stored.

In general, I support this change.

> Here is the method:
>
>   /** Allow flexible locking directories - Michael R. Duval 3/02/04 */
>   private String getLockDir() {
>     String lockDir;
>     if ((lockDir = System.getProperty("lucene.lockdir")) == null)
>       return System.getProperty("java.io.tmpdir");
>     else
>       return lockDir;
>   }

In particular, I have some quibbles.  The property should be named 
something like "org.apache.lucene.lockdir", not just "lucene.lockdir". 
And there's no reason to look it up each time: it can just be a static.

private static final String LOCK_DIR =
  System.getProperty("org.apache.lucene.lockdir",
                     System.getProperty("java.io.tmpdir"));

Doug
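
To illustrate the lookup order discussed in this message, here is a minimal sketch. It checks "org.apache.lucene.lockdir" first and falls back to "java.io.tmpdir". The class LockDirResolver and its method are hypothetical names used only for this example and are not Lucene code; a method is used here instead of a static final field so that the fallback can be exercised after the property changes.

```java
import java.io.File;

public final class LockDirResolver {
    /** Property name proposed in the thread. */
    public static final String LOCK_DIR_PROPERTY = "org.apache.lucene.lockdir";

    private LockDirResolver() {}

    /** Returns the configured lock directory, or java.io.tmpdir when none is set. */
    public static String resolveLockDir() {
        return System.getProperty(LOCK_DIR_PROPERTY,
                                  System.getProperty("java.io.tmpdir"));
    }

    public static void main(String[] args) {
        File dir = new File(resolveLockDir());
        System.out.println("lock dir: " + dir.getAbsolutePath());
    }
}
```

With a static final field the value is read once when the class is loaded, so a later System.setProperty call has no effect. Either way, two JVMs that share an index need to resolve the same directory, for example by starting each with -Dorg.apache.lucene.lockdir=<shared directory>.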



Re: java.io.tmpdir as lock dir .... once again

2004-03-02 Thread Morus Walter
Otis Gospodnetic writes:
> This looks nice.
> However, what happens if you have two Java processes that work on the
> same index, and give it different lock directories?
> They'll mess up the index.
> 
Is that different from having two Java processes using different java.io.tmpdir values?
I had that problem (one running in tomcat and one from the command line).
I don't think that making the need to choose the same directory for the
lock more explicit will increase the problems.

Morus




Re: java.io.tmpdir as lock dir .... once again

2004-03-02 Thread Matt Quail
> I had to do something similar to make the application works with
> lucene 1.3 final when upgrading from 1.3 RC1. I think it is better to
> maintain back compatiable so existing users are not affected too much
> when a new release is available.

I'd like to "me too" this sentiment. That change caused me a little churn.

=Matt





RE: java.io.tmpdir as lock dir .... once again

2004-03-02 Thread Hui Ouyang
I had to do something similar to make my application work with Lucene 1.3 final when
upgrading from 1.3 RC1. I think it is better to maintain backward compatibility so existing
users are not affected too much when a new release is available.
Regards,
Hui

-Original Message- 
From: Michael Duval [mailto:[EMAIL PROTECTED] 
Sent: Tue 3/2/2004 4:11 PM 
To: [EMAIL PROTECTED] 
Cc: 
Subject: java.io.tmpdir as lock dir  once again




Hello All,

I've come across my first gotcha with the system property
"java.io.tmpdir" as the lock directory.

Over here at APS we run lucene in two different servlet containers on
two different servers for both performance
and security reasons.  One container gives read access to the collection
and the other is constantly updating the collection.
The collection is NFS mounted from both servers.   This worked fine
until the lucene update 1.3.   Now the lock files are being
written to the temp dir's in each of the respective containers root
dir's.   This of course breaks the locking scheme.

I could have changed the tmpdir prop to write files back into the
collection directory but this would also pollute
the tmpdir with other non-related files.  My solution was as follows:

I've hacked the code for the time being by updating FSDirectory and
replaced all System.getProperty("java.io.tmpdir")
calls with a call to a new method "getLockDir()".   This method checks
for a "lucene.lockdir" prop before the
"java.io.tmpdir" prop giving the end user a bit more flexibility in
where locks are stored.

Here is the method:

  /** Allow flexible locking directories - Michael R. Duval 3/02/04 */
  private String getLockDir() {
    String lockDir;

    if ((lockDir = System.getProperty("lucene.lockdir")) == null)
      return System.getProperty("java.io.tmpdir");
    else
      return lockDir;
  }

Hopefully a solution similar to this will make it in to one of the next
distributions.

Thanks and Cheers,

Mike

--
Michael R. Duval <[EMAIL PROTECTED] >
E-Journal Programmer/Analyst
The American Physical Society
1 Research Road
Ridge, NY 11961

www.aps.org
631 591 4127








Re: java.io.tmpdir as lock dir .... once again

2004-03-02 Thread Stephane James Vaucher
I've done something similar to configure my merge factor (but it was
outside my code), and am planning on setting the limit on boolean queries
this way as well. I think it's pretty clean especially if you use
org.apache.lucene.xxx properties with decent default values.

Adding this feature could probably better document the hazards of using an
index lock in a distributed system, considering many people like to know
the implications of running lucene in a web (and potentially replicated)
env.

my 2c,
sv

On Tue, 2 Mar 2004, Otis Gospodnetic wrote:

> This looks nice.
> However, what happens if you have two Java processes that work on the
> same index, and give it different lock directories?
> They'll mess up the index.

If you sell people coffee, they can always burn themselves. Might as well
warn them.

> Should we try to prevent this by not offering this option, or should we
> offer it, document it well, and leave it up to the user to play by the
> rules or not?
>
> I'm leaning towards the latter, but I think some Lucene developers
> would be more conservative.
>
> Otis
>
>
> --- Michael Duval <[EMAIL PROTECTED]> wrote:
> >
> > Hello All,
> >
> > I've come across my first gotcha with the system property
> > "java.io.tmpdir" as the lock directory.
> >
> > Over here at APS we run Lucene in two different servlet containers on
> > two different servers, for both performance and security reasons.
> > One container gives read access to the collection and the other is
> > constantly updating the collection.  The collection is NFS mounted
> > from both servers.  This worked fine until the Lucene 1.3 update.
> > Now the lock files are being written to the temp dirs in each of the
> > respective containers' root dirs.  This of course breaks the locking
> > scheme.
> >
> > I could have changed the tmpdir prop to write files back into the
> > collection directory, but this would also pollute the collection
> > directory with other, unrelated temp files.  My solution was as
> > follows:
> >
> > I've hacked the code for the time being by updating FSDirectory and
> > replacing all System.getProperty("java.io.tmpdir") calls with a call
> > to a new method, "getLockDir()".  This method checks for a
> > "lucene.lockdir" prop before the "java.io.tmpdir" prop, giving the
> > end user a bit more flexibility in where locks are stored.
> >
> > Here is the method:
> >
> >   /** Allow flexible locking directories - Michael R. Duval 3/02/04 */
> >   private String getLockDir() {
> >     // Prefer the Lucene-specific property; fall back to the JVM temp dir.
> >     String lockDir = System.getProperty("lucene.lockdir");
> >     if (lockDir == null)
> >       return System.getProperty("java.io.tmpdir");
> >     return lockDir;
> >   }
> >
> > Hopefully a solution similar to this will make it into one of the
> > next distributions.
> >
> > Thanks and Cheers,
> >
> > Mike
> >
> > --
> > Michael R. Duval <[EMAIL PROTECTED] >
> > E-Journal Programmer/Analyst
> > The American Physical Society
> > 1 Research Road
> > Ridge, NY 11961
> >
> > www.aps.org
> > 631 591 4127





Re: java.io.tmpdir as lock dir .... once again

2004-03-02 Thread Otis Gospodnetic
This looks nice.
However, what happens if you have two Java processes that work on the
same index, and give it different lock directories?
They'll mess up the index.
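
To illustrate the concern, here is a minimal sketch of file-based locking on
a local file system. It is not Lucene's lock implementation; the class name,
helper and lock file name are hypothetical. A lock is taken by atomically
creating a marker file, so two processes only exclude each other if they
look in the same directory:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

class LockDirDemo {
    /** Tries to take a lock by atomically creating a marker file in lockDir. */
    static boolean tryLock(File lockDir, String lockName) throws IOException {
        // createNewFile() returns false if the file already exists.
        return new File(lockDir, lockName).createNewFile();
    }

    public static void main(String[] args) throws IOException {
        File dirA = Files.createTempDirectory("locksA").toFile();
        File dirB = Files.createTempDirectory("locksB").toFile();
        String name = "index-write.lock"; // hypothetical lock file name

        // Same lock directory: the second attempt is refused.
        System.out.println(tryLock(dirA, name)); // true
        System.out.println(tryLock(dirA, name)); // false

        // Different lock directory: the "same" lock is granted again,
        // so nothing stops both holders from writing to one index.
        System.out.println(tryLock(dirB, name)); // true
    }
}
```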

Should we try to prevent this by not offering this option, or should we
offer it, document it well, and leave it up to the user to play by the
rules or not?

I'm leaning towards the latter, but I think some Lucene developers
would be more conservative.

Otis


--- Michael Duval <[EMAIL PROTECTED]> wrote:
> 
> Hello All,
>
> I've come across my first gotcha with the system property
> "java.io.tmpdir" as the lock directory.
>
> Over here at APS we run Lucene in two different servlet containers on
> two different servers, for both performance and security reasons.  One
> container gives read access to the collection and the other is
> constantly updating the collection.  The collection is NFS mounted
> from both servers.  This worked fine until the Lucene 1.3 update.  Now
> the lock files are being written to the temp dirs in each of the
> respective containers' root dirs.  This of course breaks the locking
> scheme.
>
> I could have changed the tmpdir prop to write files back into the
> collection directory, but this would also pollute the collection
> directory with other, unrelated temp files.  My solution was as
> follows:
>
> I've hacked the code for the time being by updating FSDirectory and
> replacing all System.getProperty("java.io.tmpdir") calls with a call
> to a new method, "getLockDir()".  This method checks for a
> "lucene.lockdir" prop before the "java.io.tmpdir" prop, giving the end
> user a bit more flexibility in where locks are stored.
>
> Here is the method:
>
>   /** Allow flexible locking directories - Michael R. Duval 3/02/04 */
>   private String getLockDir() {
>     // Prefer the Lucene-specific property; fall back to the JVM temp dir.
>     String lockDir = System.getProperty("lucene.lockdir");
>     if (lockDir == null)
>       return System.getProperty("java.io.tmpdir");
>     return lockDir;
>   }
>
> Hopefully a solution similar to this will make it into one of the
> next distributions.
> 
> Thanks and Cheers,
> 
> Mike
> 
> -- 
> Michael R. Duval <[EMAIL PROTECTED] >
> E-Journal Programmer/Analyst
> The American Physical Society
> 1 Research Road
> Ridge, NY 11961
> 
> www.aps.org
> 631 591 4127





java.io.tmpdir as lock dir .... once again

2004-03-02 Thread Michael Duval
Hello All,

I've come across my first gotcha with the system property
"java.io.tmpdir" as the lock directory.

Over here at APS we run Lucene in two different servlet containers on
two different servers, for both performance and security reasons.  One
container gives read access to the collection and the other is
constantly updating the collection.  The collection is NFS mounted from
both servers.  This worked fine until the Lucene 1.3 update.  Now the
lock files are being written to the temp dirs in each of the respective
containers' root dirs.  This of course breaks the locking scheme.

I could have changed the tmpdir prop to write files back into the
collection directory, but this would also pollute the collection
directory with other, unrelated temp files.  My solution was as follows:

I've hacked the code for the time being by updating FSDirectory and
replacing all System.getProperty("java.io.tmpdir") calls with a call to
a new method, "getLockDir()".  This method checks for a "lucene.lockdir"
prop before the "java.io.tmpdir" prop, giving the end user a bit more
flexibility in where locks are stored.

Here is the method:

  /** Allow flexible locking directories - Michael R. Duval 3/02/04 */
  private String getLockDir() {
    // Prefer the Lucene-specific property; fall back to the JVM temp dir.
    String lockDir = System.getProperty("lucene.lockdir");
    if (lockDir == null)
      return System.getProperty("java.io.tmpdir");
    return lockDir;
  }

Hopefully a solution similar to this will make it into one of the next
distributions.
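
For anyone who wants to try the lookup outside of FSDirectory, the same
fallback can be sketched as a standalone class. The class name is made up
for illustration; only the "lucene.lockdir" and "java.io.tmpdir" property
names come from the method above:

```java
class LockDirResolver {
    /** Returns lucene.lockdir if it is set, otherwise the JVM's java.io.tmpdir. */
    static String getLockDir() {
        // The two-argument getProperty returns the default when the key is unset.
        return System.getProperty("lucene.lockdir",
                                  System.getProperty("java.io.tmpdir"));
    }

    public static void main(String[] args) {
        System.out.println("lock dir: " + getLockDir());
    }
}
```

Both containers would then be started with the same value, for example
-Dlucene.lockdir=/path/on/the/shared/mount (a placeholder path), so that
their lock files end up in one place.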

Thanks and Cheers,

Mike

--
Michael R. Duval <[EMAIL PROTECTED] >
E-Journal Programmer/Analyst
The American Physical Society
1 Research Road
Ridge, NY 11961
www.aps.org
631 591 4127

