Re: Sys properties Was: java.io.tmpdir as lock dir .... once again
Thanks, I was going to submit a patch for this feature this weekend.

sv

On Thu, 18 Mar 2004, Otis Gospodnetic wrote:
> I added support for all items listed below, except commit/write lock
> file name. I don't see why one would want to change that, considering
> those files are still limited to the index directory.
>
> Otis
Re: Sys properties Was: java.io.tmpdir as lock dir .... once again
I added support for all items listed below, except commit/write lock
file name. I don't see why one would want to change that, considering
those files are still limited to the index directory.

Otis

--- Stephane James Vaucher <[EMAIL PROTECTED]> wrote:
> How about (looking big rather than small):
>
> - MaxClause from BooleanQuery (I know there have been discussions on
>   the dev list, but I haven't been following it)
> - default commit_lock_name
> - default commit_lock_timeout
> - default maxFieldLength
> - default maxMergeDocs
> - default mergeFactor
> - default minMergeDocs
> - default write_lock_name
> - default write_lock_timeout
>
> I'm currently configuring parts of my app using sys properties,
> particularly the mergeFactor because my prod system has 2GB of RAM
> and is windows based and my dev machine has 256MB and is linux. If
> no one takes a crack at this, I'll see what I can do in 2 weeks,
> after my vacation.
>
> Cheers,
> sv
RE: Sys properties Was: java.io.tmpdir as lock dir .... once again
(ms) 16
39772 total within(ms) 32
18645 total within(ms) 32
995 total within(ms) 47

6 open index time:47
28367 total within(ms) 15
37970 total within(ms) 31
45169 total within(ms) 46
21168 total within(ms) 31
1112 total within(ms) 31

7 open index time:31
31424 total within(ms) 31
42002 total within(ms) 16
49994 total within(ms) 31
23432 total within(ms) 32
1223 total within(ms) 47

8 open index time:46
33895 total within(ms) 32
45292 total within(ms) 47
53957 total within(ms) 47
25230 total within(ms) 32
1352 total within(ms) 47

9 open index time:63
37320 total within(ms) 31
49922 total within(ms) 15
59412 total within(ms) 47
27830 total within(ms) 31
1474 total within(ms) 62

10 open index time:984
38475 total within(ms) 16
51552 total within(ms) 16
61389 total within(ms) 47
28638 total within(ms) 46
1530 total within(ms) 157

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Monday, March 08, 2004 1:16 PM
To: Lucene Users List
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir once again

Thanks for performing this benchmark!

It looks like the compound format is around 7% slower when indexing. To
my thinking that's acceptable, given the dramatic reduction in file
handles. If folks really need maximal indexing performance, then they
can explicitly disable the compound format.

Would anyone object to making compound format the default for Lucene
1.4? This is an incompatible change, but I don't think it should break
applications.

Doug
Re: Sys properties Was: java.io.tmpdir as lock dir .... once again
I tend to agree (but with the same uncertainty as to why I feel that way).

Regards,

Terry

----- Original Message -----
From: "Otis Gospodnetic" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Monday, March 08, 2004 2:34 PM
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir once again

> I can't explain why, but I feel like the old index format should stay
> by default. I feel like I'd rather a (slightly) faster index, and
> switch to the compound one when/IF I encounter problems, than have a
> safer, but slower index, and never realize that there is a faster
> option available.
>
> Weak argument, I know, but some instinct in me thinks that the current
> mode should remain.
>
> Otis
Re: Sys properties Was: java.io.tmpdir as lock dir .... once again
I can't explain why, but I feel like the old index format should stay
by default. I feel like I'd rather a (slightly) faster index, and
switch to the compound one when/IF I encounter problems, than have a
safer, but slower index, and never realize that there is a faster
option available.

Weak argument, I know, but some instinct in me thinks that the current
mode should remain.

Otis

--- Doug Cutting <[EMAIL PROTECTED]> wrote:
> Thanks for performing this benchmark!
>
> It looks like the compound format is around 7% slower when indexing.
> To my thinking that's acceptable, given the dramatic reduction in file
> handles. If folks really need maximal indexing performance, then they
> can explicitly disable the compound format.
>
> Would anyone object to making compound format the default for Lucene
> 1.4? This is an incompatible change, but I don't think it should
> break applications.
>
> Doug
Re: Sys properties Was: java.io.tmpdir as lock dir .... once again
hui wrote:
> Index time:
> compound format is 89 seconds slower.
>
> compound format:
> 1389507 total milliseconds
> non-compound format:
> 1300534 total milliseconds
>
> The index size is 85m with 4 fields only. The files are stored in the index.
> The compound format has only 3 files and the other has 13 files.

Thanks for performing this benchmark!

It looks like the compound format is around 7% slower when indexing. To
my thinking that's acceptable, given the dramatic reduction in file
handles. If folks really need maximal indexing performance, then they
can explicitly disable the compound format.

Would anyone object to making compound format the default for Lucene
1.4? This is an incompatible change, but I don't think it should break
applications.

Doug
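
[Editor's note] The "89 seconds" and "around 7%" figures can be checked directly from the two totals hui reported. The sketch below is only an arithmetic check; the class and method names are made up for illustration and are not part of Lucene.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: re-derives the figures quoted in the thread.
final class CompoundFormatOverhead {

    /** Extra time of slowerMs relative to baselineMs, as a percentage of the baseline. */
    static double slowdownPercent(long baselineMs, long slowerMs) {
        return 100.0 * (slowerMs - baselineMs) / baselineMs;
    }

    public static void main(String[] args) {
        long nonCompoundMs = 1300534L; // total indexing time reported for the non-compound format
        long compoundMs = 1389507L;    // total indexing time reported for the compound format

        long extraMs = compoundMs - nonCompoundMs;
        System.out.println("extra indexing time (ms): " + extraMs);
        System.out.println(String.format(Locale.ROOT, "slowdown: %.1f%%",
                slowdownPercent(nonCompoundMs, compoundMs)));
    }
}
```

The difference is 88973 ms, i.e. roughly 89 seconds, and the relative slowdown comes out at about 6.8% of the non-compound time, which matches the "around 7%" estimate.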
RE: Sys properties Was: java.io.tmpdir as lock dir .... once again
Thank you, the conversion option in Luke is really helpful for migrating
existing user indexes.

Regards,
Hui

-----Original Message-----
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
Sent: Monday, March 08, 2004 10:57 AM
To: Lucene Users List
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir once again

hui wrote:
> Hi,
>
> Here is the indexing performance testing result for the two index formats.

A shameless plug: you can use Luke (http://www.getopt.org/luke) to
convert the same index between compound/non-compound formats, which
could be useful to rule out any possible differences in the
indexing/inserting process between the runs. Luke provides you also
with a simple time measurement for query execution.

Just FYI.

--
Best regards,
Andrzej Bialecki
Re: Sys properties Was: java.io.tmpdir as lock dir .... once again
hui wrote:
> Hi,
>
> Here is the indexing performance testing result for the two index formats.

A shameless plug: you can use Luke (http://www.getopt.org/luke) to
convert the same index between compound/non-compound formats, which
could be useful to rule out any possible differences in the
indexing/inserting process between the runs. Luke provides you also
with a simple time measurement for query execution.

Just FYI.

--
Best regards,
Andrzej Bialecki

-
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-
FreeBSD developer (http://www.freebsd.org)
RE: Sys properties Was: java.io.tmpdir as lock dir .... once again
Hi,

Here is the indexing performance testing result for the two index formats.

1000 megahertz Intel Pentium III (2 installed)
32 kilobyte primary memory cache
256 kilobyte secondary memory cache
SCSI Hard drive 145.45 GB
RAM 3G
Windows 2000 Advanced Server, Service Pack 2
JDK 140
JVM memory 512m

Indexed files: 66100 local text files, around 400m

Index time:
compound format is 89 seconds slower.

compound format:
1389507 total milliseconds
non-compound format:
1300534 total milliseconds

The index size is 85m with 4 fields only. The files are stored in the index.
The compound format has only 3 files and the other has 13 files.

Search Time (with only top 10 retrieved, no indexing at the same time,
only one thread search, indices are optimized and opened)
I do not see much consistent difference for this simple situation.

compound format:
Query: iraq            4275 total within(ms) 110
Query: war             5728 total within(ms) 0
Query: iraq AND war    3182 total within(ms) 16

non-compound format:
Query: war             5728 total within(ms) 125
Query: iraq war        6821 total within(ms) 31
Query: iraq AND war    3182 total within(ms) 0

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 04, 2004 11:54 AM
To: Lucene Users List
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir once again

The compound format slows indexing performance slightly, but should not
affect search performance much. It radically reduces the number of file
handles used when searching, by a factor of eight or more, depending on
how many indexed fields you have.

Perhaps the compound format should be the default format in 1.4. Can
folks provide any benchmarks for how it affects performance?

Doug
Re: Sys properties Was: java.io.tmpdir as lock dir .... once again
hui wrote:
> Not yet. For the compound file format, when the files get bigger, if I
> frequently add a few new files, the bigger files have to be updated.
> Will that affect search a lot and produce heavier disk I/O compared
> with the traditional index format? It seems the OS cache makes quite a
> difference when the files are not changed.

The compound format slows indexing performance slightly, but should not
affect search performance much. It radically reduces the number of file
handles used when searching, by a factor of eight or more, depending on
how many indexed fields you have.

Perhaps the compound format should be the default format in 1.4. Can
folks provide any benchmarks for how it affects performance?

Doug
RE: Sys properties Was: java.io.tmpdir as lock dir .... once again
Not yet. For the compound file format, when the files get bigger, if I
frequently add a few new files, the bigger files have to be updated.
Will that affect search a lot and produce heavier disk I/O compared
with the traditional index format? It seems the OS cache makes quite a
difference when the files are not changed.

Regards,
Hui

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 03, 2004 9:21 PM
To: Lucene Users List
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir once again

Have you tried using the compound file format introduced in 1.3?
Re: Sys properties Was: java.io.tmpdir as lock dir .... once again
On Mar 3, 2004, at 4:25 PM, hui wrote:
> Another similar issue. If we could have a parameter to control the max
> number of files within the index, that would avoid the problem of
> running out of file handles.
> When the number of files within one index reaches the limit,
> optimization would be called.
> Right now, if the number of files within one index exceeds the limit
> of your Windows system, you lose the index.
> Thank you for the consideration.

Have you tried using the compound file format introduced in 1.3?
RE: Sys properties Was: java.io.tmpdir as lock dir .... once again
Another similar issue. If we could have a parameter to control the max
number of files within the index, that would avoid the problem of
running out of file handles.
When the number of files within one index reaches the limit,
optimization would be called.
Right now, if the number of files within one index exceeds the limit
of your Windows system, you lose the index.
Thank you for the consideration.

Regards,
hui

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 03, 2004 3:46 PM
To: Lucene Users List
Subject: Re: Sys properties Was: java.io.tmpdir as lock dir once again

Stephane James Vaucher wrote:
> As I've stated in my earlier mail, I like this change. More importantly,
> could this become a "standard" way of changing configurations at runtime?
> For example, the default merge factor could also be set in this manner.

Sure, that's reasonable, so this would be something like:

   private static final int DEFAULT_MERGE_FACTOR =
     Integer.parseInt(System.getProperty("org.apache.lucene.mergeFactor","10"));

In IndexWriter.java.

What other candidates are there for this treatment?

Doug
Re: Sys properties Was: java.io.tmpdir as lock dir .... once again
How about (looking big rather than small):

- MaxClause from BooleanQuery (I know there have been discussions on
  the dev list, but I haven't been following it)
- default commit_lock_name
- default commit_lock_timeout
- default maxFieldLength
- default maxMergeDocs
- default mergeFactor
- default minMergeDocs
- default write_lock_name
- default write_lock_timeout

I'm currently configuring parts of my app using sys properties,
particularly the mergeFactor because my prod system has 2GB of RAM and is
windows based and my dev machine has 256MB and is linux. If no one takes a
crack at this, I'll see what I can do in 2 weeks, after my vacation.

Cheers,
sv

On Wed, 3 Mar 2004, Doug Cutting wrote:
> Sure, that's reasonable, so this would be something like:
>
>    private static final int DEFAULT_MERGE_FACTOR =
>      Integer.parseInt(System.getProperty("org.apache.lucene.mergeFactor","10"));
>
> In IndexWriter.java.
>
> What other candidates are there for this treatment?
>
> Doug
Re: Sys properties Was: java.io.tmpdir as lock dir .... once again
Stephane James Vaucher wrote:
> As I've stated in my earlier mail, I like this change. More importantly,
> could this become a "standard" way of changing configurations at runtime?
> For example, the default merge factor could also be set in this manner.

Sure, that's reasonable, so this would be something like:

   private static final int DEFAULT_MERGE_FACTOR =
     Integer.parseInt(System.getProperty("org.apache.lucene.mergeFactor","10"));

In IndexWriter.java.

What other candidates are there for this treatment?

Doug
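
[Editor's note] One thing to keep in mind with the one-liner above: `Integer.parseInt` throws `NumberFormatException` if the property is set to something that is not a number, and inside a static initializer that surfaces as an `ExceptionInInitializerError` when the class is loaded. The sketch below shows one possible defensive variant. The helper name `intProperty` and the decision to fall back to the default on a malformed value are assumptions for illustration, not what Lucene does.

```java
import java.util.Properties;

// Hypothetical helper, not part of Lucene: reads an integer setting with a fallback.
final class IntSetting {

    /**
     * Returns the integer value of the named property, or defaultValue when the
     * property is missing or cannot be parsed as an int (assumed policy).
     */
    static int intProperty(Properties props, String name, int defaultValue) {
        String raw = props.getProperty(name);
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return defaultValue; // malformed value: keep the built-in default
        }
    }

    public static void main(String[] args) {
        // Property name taken from the message above; 10 is the default it uses.
        int mergeFactor = intProperty(System.getProperties(),
                "org.apache.lucene.mergeFactor", 10);
        System.out.println("mergeFactor = " + mergeFactor);
    }
}
```

Running it with `-Dorg.apache.lucene.mergeFactor=50` on the `java` command line would select 50; without the flag the default of 10 applies. Silently ignoring a bad value is a trade-off: failing fast, as the one-liner does, makes misconfiguration more visible.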
Sys properties Was: java.io.tmpdir as lock dir .... once again
As I've stated in my earlier mail, I like this change. More importantly,
could this become a "standard" way of changing configurations at runtime?
For example, the default merge factor could also be set in this manner.

sv

On Wed, 3 Mar 2004, Michael Duval wrote:
> I agree with both the property name change and also making it static.
>
> Mike
Re: java.io.tmpdir as lock dir .... once again
I agree with both the property name change and also making it static.

Mike

Doug Cutting wrote:
> Michael Duval wrote:
>> I've hacked the code for the time being by updating FSDirectory and
>> replaced all System.getProperty("java.io.tmpdir")
>> calls with a call to a new method "getLockDir()". This method
>> checks for a "lucene.lockdir" prop before the
>> "java.io.tmpdir" prop giving the end user a bit more flexibility in
>> where locks are stored.
>
> In general, I support this change.
>
>> Here is the method:
>>
>> /** Allow flexible locking directories - Michael R. Duval 3/02/04 */
>> private String getLockDir() {
>>    String lockDir;
>>
>>    if ((lockDir = System.getProperty("lucene.lockdir")) == null)
>>        return System.getProperty("java.io.tmpdir");
>>    else
>>        return lockDir;
>> }
>
> In particular, I have some quibbles. The property should be named
> something like "org.apache.lucene.lockdir", not just "lucene.lockdir".
> And there's no reason to look it up each time: it can just be a static.
>
>   private static final String LOCK_DIR =
>     System.getProperty("org.apache.lucene.lockdir",
>                        System.getProperty("java.io.tmpdir"));
>
> Doug
Re: java.io.tmpdir as lock dir .... once again
Michael Duval wrote:
> I've hacked the code for the time being by updating FSDirectory and
> replaced all System.getProperty("java.io.tmpdir")
> calls with a call to a new method "getLockDir()". This method
> checks for a "lucene.lockdir" prop before the
> "java.io.tmpdir" prop giving the end user a bit more flexibility in
> where locks are stored.

In general, I support this change.

> Here is the method:
>
> /** Allow flexible locking directories - Michael R. Duval 3/02/04 */
> private String getLockDir() {
>    String lockDir;
>
>    if ((lockDir = System.getProperty("lucene.lockdir")) == null)
>        return System.getProperty("java.io.tmpdir");
>    else
>        return lockDir;
> }

In particular, I have some quibbles. The property should be named
something like "org.apache.lucene.lockdir", not just "lucene.lockdir".
And there's no reason to look it up each time: it can just be a static.

  private static final String LOCK_DIR =
    System.getProperty("org.apache.lucene.lockdir",
                       System.getProperty("java.io.tmpdir"));

Doug
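
[Editor's note] The lookup order discussed here (explicit lock directory first, JVM temp directory as the fallback) can be tried outside Lucene with a few lines. This is a minimal sketch: the class `LockDirLookup` and the method `resolveLockDir` are hypothetical names, and the lookup takes a `Properties` argument so it can be exercised without touching the real system properties. Only the property names come from the messages above.

```java
import java.util.Properties;

// Hypothetical stand-alone sketch of the lookup order proposed in the thread.
final class LockDirLookup {

    static final String LOCK_DIR_PROPERTY = "org.apache.lucene.lockdir";

    /** Returns the explicit lock directory if set, otherwise the temp directory. */
    static String resolveLockDir(Properties props) {
        // The two-argument getProperty returns the second argument when the key is absent.
        return props.getProperty(LOCK_DIR_PROPERTY, props.getProperty("java.io.tmpdir"));
    }

    public static void main(String[] args) {
        System.out.println("lock dir: " + resolveLockDir(System.getProperties()));
    }
}
```

Started with `-Dorg.apache.lucene.lockdir=<some directory>`, the sketch prints that directory; otherwise it prints the JVM's `java.io.tmpdir`. Note that a `static final` field like the one suggested above is evaluated once, when the class is initialized, so later changes to the system property would not be picked up.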
Re: java.io.tmpdir as lock dir .... once again
Otis Gospodnetic writes:
> This looks nice.
> However, what happens if you have two Java processes that work on the
> same index, and give it different lock directories?
> They'll mess up the index.
>
Is that different from having two java processes using different
java.io.tmpdir settings?
I had that problem (one running in tomcat and one from the command line).

I don't think that making the need to choose the same directory for the lock
more explicit will increase the problems.

Morus
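
[Editor's note] The situation Morus describes, two JVMs with different temp directories working on one index, can be illustrated without Lucene by comparing the directory each process would resolve. Everything in this sketch is hypothetical: the class, the helper names, and the example paths. It only models the lookup order proposed earlier in the thread.

```java
import java.util.Objects;
import java.util.Properties;

// Hypothetical illustration: do two JVM configurations agree on the lock directory?
final class SharedLockDirCheck {

    static final String LOCK_DIR_PROPERTY = "org.apache.lucene.lockdir";

    /** Lock directory a JVM with these properties would use (explicit setting, else tmpdir). */
    static String lockDir(Properties props) {
        return props.getProperty(LOCK_DIR_PROPERTY, props.getProperty("java.io.tmpdir"));
    }

    /** Builds the relevant properties of one simulated JVM; lockDirOrNull may be null. */
    static Properties jvm(String tmpDir, String lockDirOrNull) {
        Properties props = new Properties();
        props.setProperty("java.io.tmpdir", tmpDir);
        if (lockDirOrNull != null) {
            props.setProperty(LOCK_DIR_PROPERTY, lockDirOrNull);
        }
        return props;
    }

    /** True when both simulated JVMs would look for lock files in the same place. */
    static boolean shareLockDir(Properties a, Properties b) {
        return Objects.equals(lockDir(a), lockDir(b));
    }

    public static void main(String[] args) {
        // Example paths are made up. Without an explicit setting, each JVM uses its own tmpdir.
        System.out.println(shareLockDir(jvm("/opt/tomcat/temp", null), jvm("/tmp", null)));

        // With the same explicit lock directory, both resolve the same location.
        System.out.println(shareLockDir(
                jvm("/opt/tomcat/temp", "/var/lucene/locks"),
                jvm("/tmp", "/var/lucene/locks")));
    }
}
```

The first comparison is false and the second is true, which is the point made above: the mismatch already exists with `java.io.tmpdir` alone, and an explicit property at least gives both processes one setting to agree on. For processes on different machines, the directory would additionally have to be on storage that both can see.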
Re: java.io.tmpdir as lock dir .... once again
> I had to do something similar to make the application work with
> lucene 1.3 final when upgrading from 1.3 RC1. I think it is better to
> maintain backward compatibility so existing users are not affected too
> much when a new release is available.

I'd like to "me too" this sentiment. That change caused me a little churn.

=Matt
RE: java.io.tmpdir as lock dir .... once again
I had to do something similar to make the application work with
lucene 1.3 final when upgrading from 1.3 RC1. I think it is better to
maintain backward compatibility so existing users are not affected too
much when a new release is available.

Regards,
Hui

-----Original Message-----
From: Michael Duval [mailto:[EMAIL PROTECTED]
Sent: Tue 3/2/2004 4:11 PM
To: [EMAIL PROTECTED]
Cc:
Subject: java.io.tmpdir as lock dir once again

Hello All,

I've come across my first gotcha with the system property
"java.io.tmpdir" as the lock directory.

Over here at APS we run lucene in two different servlet containers on
two different servers for both performance and security reasons. One
container gives read access to the collection and the other is
constantly updating the collection. The collection is NFS mounted from
both servers. This worked fine until the update to lucene 1.3. Now the
lock files are being written to the temp dirs in each of the respective
containers' root dirs. This of course breaks the locking scheme.

I could have changed the tmpdir prop to write files back into the
collection directory but this would also pollute the tmpdir with other
non-related files. My solution was as follows:

I've hacked the code for the time being by updating FSDirectory and
replaced all System.getProperty("java.io.tmpdir") calls with a call to
a new method "getLockDir()". This method checks for a "lucene.lockdir"
prop before the "java.io.tmpdir" prop giving the end user a bit more
flexibility in where locks are stored.

Here is the method:

/** Allow flexible locking directories - Michael R. Duval 3/02/04 */
private String getLockDir() {
   String lockDir;

   if ((lockDir = System.getProperty("lucene.lockdir")) == null)
       return System.getProperty("java.io.tmpdir");
   else
       return lockDir;
}

Hopefully a solution similar to this will make it in to one of the next
distributions.

Thanks and Cheers,

Mike

--
Michael R. Duval <[EMAIL PROTECTED]>
E-Journal Programmer/Analyst
The American Physical Society
1 Research Road
Ridge, NY 11961

www.aps.org
631 591 4127
Re: java.io.tmpdir as lock dir .... once again
I've done something similar to configure my merge factor (but it was
outside my code), and am planning on setting the limit on boolean queries
this way as well. I think it's pretty clean especially if you use
org.apache.lucene.xxx properties with decent default values.

Adding this feature could also help document the hazards of using an
index lock in a distributed system, considering many people would like to
know the implications of running lucene in a web (and potentially
replicated) env.

my 2c,
sv

On Tue, 2 Mar 2004, Otis Gospodnetic wrote:
> This looks nice.
> However, what happens if you have two Java processes that work on the
> same index, and give it different lock directories?
> They'll mess up the index.

If you sell people coffee, they can always burn themselves. Might as well
warn them.

> Should we try to prevent this by not offering this option, or should we
> offer it, document it well, and leave it up to the user to play by the
> rules or not?
>
> I'm leaning towards the latter, but I think some Lucene developers
> would be more conservative.
>
> Otis
Re: java.io.tmpdir as lock dir .... once again
This looks nice.
However, what happens if you have two Java processes that work on the same index, and give it different lock directories? They'll mess up the index.

Should we try to prevent this by not offering this option, or should we offer it, document it well, and leave it up to the user to play by the rules or not?

I'm leaning towards the latter, but I think some Lucene developers would be more conservative.

Otis
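The hazard raised above can be illustrated with a small, simplified sketch. It assumes a lock is nothing more than a file created atomically in the lock directory; this is an illustration, not Lucene's actual locking code. The class name, helper names, and the lock file name are all hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/** Illustration only: shows why every process must agree on the lock directory. */
final class SplitLockDemo {
    private SplitLockDemo() {}

    /** Creates a fresh temporary directory standing in for one process's lock directory. */
    static File newLockDir(String prefix) {
        try {
            return Files.createTempDirectory(prefix).toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Tries to take a lock by creating the lock file; returns true only if this call created it. */
    static boolean tryLock(File lockDir, String lockName) {
        try {
            return new File(lockDir, lockName).createNewFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File sharedDir = newLockDir("shared-locks");
        File otherDir = newLockDir("other-locks");
        String lockName = "index-write.lock"; // hypothetical lock file name

        System.out.println("first writer, shared dir:  " + tryLock(sharedDir, lockName));
        System.out.println("second writer, shared dir: " + tryLock(sharedDir, lockName));
        System.out.println("second writer, other dir:  " + tryLock(otherDir, lockName));
    }
}
```

The second attempt in the shared directory is refused, but the same attempt against a different directory succeeds, so both "writers" would believe they hold the lock. That is the situation two processes end up in when they are given different lock directories for the same index.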
java.io.tmpdir as lock dir .... once again
Hello All,

I've come across my first gotcha with the system property "java.io.tmpdir" as the lock directory.

Over here at APS we run Lucene in two different servlet containers on two different servers, for both performance and security reasons. One container gives read access to the collection and the other is constantly updating the collection. The collection is NFS-mounted from both servers. This worked fine until the Lucene 1.3 update. Now the lock files are being written to the temp dirs in each of the respective containers' root dirs. This of course breaks the locking scheme.

I could have changed the tmpdir prop to write files back into the collection directory, but then other, unrelated temp files would end up there as well. My solution was as follows:

I've hacked the code for the time being by updating FSDirectory and replacing all System.getProperty("java.io.tmpdir") calls with a call to a new method, getLockDir(). This method checks for a "lucene.lockdir" prop before the "java.io.tmpdir" prop, giving the end user a bit more flexibility in where locks are stored.

Here is the method:

    /** Allow flexible locking directories - Michael R. Duval 3/02/04 */
    private String getLockDir() {
        String lockDir;

        if ((lockDir = System.getProperty("lucene.lockdir")) == null)
            return System.getProperty("java.io.tmpdir");
        else
            return lockDir;
    }

Hopefully a solution similar to this will make it into one of the next distributions.

Thanks and Cheers,

Mike

--
Michael R. Duval <[EMAIL PROTECTED]>
E-Journal Programmer/Analyst
The American Physical Society
1 Research Road
Ridge, NY 11961

www.aps.org
631 591 4127
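For anyone who wants to try the lookup outside of FSDirectory, here is a standalone sketch of the same fallback. The class name LockDirs is made up for illustration and is not part of Lucene; the "lucene.lockdir" property name is the one from the message above. Unlike the original method, this version also treats a blank property value as unset, which is an assumption of the sketch.

```java
import java.io.File;

/** Hypothetical standalone helper mirroring the getLockDir() idea; not part of Lucene. */
final class LockDirs {
    /** Property name taken from the message above. */
    static final String LOCK_DIR_PROP = "lucene.lockdir";

    private LockDirs() {}

    /**
     * Returns the directory named by lucene.lockdir, or the JVM temp directory
     * (java.io.tmpdir) when the property is unset or blank.
     */
    static File resolveLockDir() {
        String configured = System.getProperty(LOCK_DIR_PROP);
        if (configured == null || configured.trim().isEmpty()) {
            return new File(System.getProperty("java.io.tmpdir"));
        }
        return new File(configured.trim());
    }

    public static void main(String[] args) {
        System.out.println("lock directory: " + resolveLockDir().getAbsolutePath());
    }
}
```

Both containers would then be started with the same setting, for example -Dlucene.lockdir pointing at a directory on the shared NFS mount, so that they look for locks in the same place.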