[jira] Updated: (LUCENE-443) ConjunctionScorer tune-up

2006-09-21 Thread Paul Elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-443?page=all ] Paul Elschot updated LUCENE-443: Attachment: Conjunction20060921.patch IIRC the original performance problem was caused by creation of objects in the tight loop doing skipTo() on all the scorer

[jira] Commented: (LUCENE-443) ConjunctionScorer tune-up

2006-09-21 Thread Paul Elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-443?page=comments#action_12436453 ] Paul Elschot commented on LUCENE-443: - I just overlooked the grant by Abdul to the ASF. > ConjunctionScorer tune-up > - > >

Re: Clustering IndexWriter?

2006-09-21 Thread adasal
Don't be coy, what's your company? Adam On 21/09/06, Steve Harris <[EMAIL PROTECTED]> wrote: Warning, I'm a vendor dude but this isn't really a vendor message. My IT guy had mentioned to me that a bunch of the open source products we use (JIRA, JForum etc) have Lucene inside and in the name o

[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-09-21 Thread Karl Wettin (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436502 ] Karl Wettin commented on LUCENE-675: It is also interesting to know how much time is consumed to assemble an instance of Document from the storage. According t

Re: Clustering IndexWriter?

2006-09-21 Thread Vic Bancroft
adasal wrote: Don't be coy, what's your company? This URL is derivable from the text, with a little search engine help . . . http://www.terracottatech.com/terracotta_spring.shtml more, l8r, v On 21/09/06, Steve Harris <[EMAIL PROTECTED]> wrote: Warning, I'm a vendor dude but this isn

[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-09-21 Thread Grant Ingersoll (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436516 ] Grant Ingersoll commented on LUCENE-675: Since this has dependencies, do you think we should put it under contrib? I would be for a Performance directory

[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-09-21 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436518 ] Andrzej Bialecki commented on LUCENE-675: -- The dependency on commons-compress could be avoided - I used this just to be able to unpack tar.gz files, we c

[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-09-21 Thread Grant Ingersoll (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436519 ] Grant Ingersoll commented on LUCENE-675: Yeah, ANT can do this, I think. Take a look at the DB contrib package, it downloads. I think I can set up the nec

Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Yep, that's us. No secret, just didn't want to make my question a billboard :-). Just needed a bit of info from the people who know best. Cheers, steve On 9/21/06, Vic Bancroft <[EMAIL PROTECTED]> wrote: adasal wrote: > Don't be coy, what's your company? This URL is derivable from the text,

Re: Clustering IndexWriter?

2006-09-21 Thread Yonik Seeley
On 9/20/06, Steve Harris <[EMAIL PROTECTED]> wrote: Is clustering the IndexWriter really all I need to do? Hi Steve, Could you explain the details of what "clustering" really means in this context? -Yonik http://incubator.apache.org/solr Solr, the open-source Lucene search server

Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Sure, I'm fairly new to Lucene but what I was trying to do was make it so that an index could be shared among multiple nodes. If an index is updated in any way it would be updated across the cluster coherently. In my first version I was really only taking advantage of the fact that we detect fine

Re: Re: Clustering IndexWriter?

2006-09-21 Thread Otis Gospodnetic
I don't fully follow, and I don't even have the "it's late!" excuse. It sounds like you want to have the same index on multiple nodes in the cluster and when a data change occurs, you want to synchronously make the same change to all indices in your cluster. Is that it? Solr has a different a

[jira] Updated: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-21 Thread Ning Li (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=all ] Ning Li updated LUCENE-565: --- Attachment: NewIndexModifier.Sept21.patch This is to update the delete-support patch after the commit of the new merge policy. - Very few changes to IndexWriter. - T

[jira] Commented: (LUCENE-675) Lucene benchmark: objective performance test for Lucene

2006-09-21 Thread Otis Gospodnetic (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-675?page=comments#action_12436587 ] Otis Gospodnetic commented on LUCENE-675: - I still haven't gotten my employer to sign and fax the CCLA, so I'm stuck and can't contribute my search benchma

Re: [jira] Updated: (LUCENE-665) temporary file access denied on Windows

2006-09-21 Thread Chris Hostetter
The recurring pattern seems to be... ResultType methodName(ArgType args) throws ExceptionType { int trialsSoFar = 0; long maxTime = System.currentTimeMillis() + maxTotalDelay; Exception error = null; while (waitAgain(maxTime, trialsSoFar++, error)) { try { return s
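A minimal sketch of the retry loop described above, written as a generic helper. The names maxTotalDelay, trialsSoFar, maxTime, error and waitAgain come from the snippet; the Callable wrapper, the pauseMillis parameter and the body of waitAgain are assumptions made for illustration, since the original message is cut off before those details.

```java
import java.util.concurrent.Callable;

final class RetrySketch {
    /**
     * Runs the action until it succeeds or the total delay budget is spent,
     * then rethrows the last failure.
     */
    static <T> T retry(Callable<T> action, long maxTotalDelayMillis, long pauseMillis)
            throws Exception {
        int trialsSoFar = 0;
        long maxTime = System.currentTimeMillis() + maxTotalDelayMillis;
        Exception error = null;
        while (waitAgain(maxTime, trialsSoFar++, pauseMillis)) {
            try {
                return action.call();
            } catch (Exception e) {
                error = e; // remember the failure and try again
            }
        }
        // The first trial always runs, so error is set whenever we get here.
        throw error;
    }

    /** Assumed behaviour: never wait before the first trial, stop once maxTime has passed. */
    static boolean waitAgain(long maxTime, int trialsSoFar, long pauseMillis) {
        if (trialsSoFar == 0) {
            return true;
        }
        if (System.currentTimeMillis() >= maxTime) {
            return false;
        }
        try {
            Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return true;
    }
}
```

Because the helper declares throws Exception, a wrapped method that originally threw nothing would still need a catch at the call site, which may be the limitation raised later in the thread.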

Re: Re: Clustering IndexWriter?

2006-09-21 Thread Chris Hostetter
: Questions: : Is this useful in the real world : Would it be possible to get that one small thing changed. I'm not really clear on what the "small thing" is that you are asking about ... you mentioned SegmentInfos subclassing Vector, are you proposing an alternative? If you've got a patch that

help on Lock.obtain(lockWaitTimeout)

2006-09-21 Thread Michael McCandless
I'm working on a LockFactory that uses java.nio.* (OS native locks) for its locks. This should be a big help for people who keep finding their lock files left on disk due to abnormal shutdown, etc (because OS will free the locks, no matter what, "in theory"). I thought I was nearly done but
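For readers unfamiliar with OS native locks in java.nio, here is a rough sketch of the idea using FileChannel.tryLock(). The class NativeLockSketch and its method names are hypothetical and are not the LockFactory under discussion; the sketch only shows why such a lock disappears when the process dies, since the OS releases it together with the file handle.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

final class NativeLockSketch implements AutoCloseable {
    private final RandomAccessFile file;
    private final FileChannel channel;
    private FileLock lock;

    NativeLockSketch(File path) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
        this.channel = file.getChannel();
    }

    /** Tries once to take the lock; returns false if someone else holds it. */
    boolean obtain() throws IOException {
        if (lock != null) {
            return true; // already held by this instance
        }
        try {
            lock = channel.tryLock(); // null if another process holds the lock
        } catch (OverlappingFileLockException e) {
            lock = null; // held elsewhere in this same JVM
        }
        return lock != null;
    }

    void release() throws IOException {
        if (lock != null) {
            lock.release();
            lock = null;
        }
    }

    @Override
    public void close() throws IOException {
        release();
        file.close();
    }
}
```

Note that obtain() here is a single non-blocking attempt; any timeout handling would have to be layered on top of it.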

Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Fair question. All I did/need was take SegmentInfos and instead of subclassing Vector I made it contain a Vector. Went from subclassing to aggregation. As far as I could tell from reading the code it would make no difference to anyone and should have no performance impact (good or bad). It just a
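The move from subclassing to aggregation can be sketched roughly as follows. SegmentListSketch is a hypothetical stand-in, not the actual SegmentInfos class, and the String element type and method set are assumptions chosen to keep the example small.

```java
import java.util.Vector;

// Before (subclassing): class SegmentListSketch extends Vector<String> { ... }
// After (aggregation): the Vector becomes a private field and the class
// exposes only the operations its callers actually need.
final class SegmentListSketch {
    private final Vector<String> segments = new Vector<>();

    void add(String segmentName) {
        segments.add(segmentName);
    }

    String get(int index) {
        return segments.get(index);
    }

    int size() {
        return segments.size();
    }
}
```

Callers that only use these delegating methods see no behavioural difference, while the backing collection is no longer part of the class's public type.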

Re: [jira] Updated: (LUCENE-665) temporary file access denied on Windows

2006-09-21 Thread Doron Cohen
Thanks for the comments! Indeed the first version I wrote followed the pattern you suggest (let's name it pattern_1 for the discussion). However with pattern_1 I could not cover the case of a method originally not throwing an exception. The problem is that in pattern_1 we have to catch the excepti

Re: help on Lock.obtain(lockWaitTimeout)

2006-09-21 Thread Yonik Seeley
On 9/21/06, Michael McCandless <[EMAIL PROTECTED]> wrote: Anyway, my first reaction was to change this to use System.currentTimeMillis() to measure elapsed time, but then I remembered that this is a dangerous approach because whenever the clock on the machine is updated (eg by a time-sync NTP client) it wo

Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Yonik Seeley
While automatically clustering java objects sure sounds cool, I have to wonder what the performance ends up being. Every small change to the clustered objects is broadcast to all the nodes, correct? Have you done any performance comparisons to see if this is a practical approach for Lucene? -Yo

Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Good question. May or may not be performant enough. Only time (and testing) will tell. My guess is that it will depend heavily on the rate at which the data changes (or read write ratio). Believe me, I'm not proposing that everyone go out and cluster lucene with terracotta dso. I'm really just pl

Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Yonik Seeley
On 9/21/06, Steve Harris <[EMAIL PROTECTED]> wrote: My guess is that some segment of the world cares a lot about realtime coherent updates and some segment of the world needs blinding speed. Part of my research is to gather the expertise of this group on these issues. I hear ya... There is ano

Re: Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Interesting. I wonder, I have a notification mechanism at my disposal as well. I wonder if it could be worked out that, much like a mvc, an IndexReader could be notified when the underlying Directory has changed so that the reader can adjust itself? Cheers, Steve On 9/21/06, Yonik Seeley <[EMAI

Re: help on Lock.obtain(lockWaitTimeout)

2006-09-21 Thread Doron Cohen
For obtain(timeout), to prevent waiting too long you could compute the maximum number of times that obtain() can be executed (assuming, as in current code, that obtain() executes in no time). Then break if either it was executed sufficiently many times or if time is up. I don't see how to prevent w
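One way to read this suggestion is sketched below: the loop stops either when the precomputed attempt budget is used up or when the wall-clock deadline passes, so a clock jump cannot make it wait much longer than intended. BoundedLockWait, the BooleanSupplier standing in for a single obtain() attempt, and the pollMillis parameter are all assumptions for illustration, not the actual Lock code.

```java
import java.util.function.BooleanSupplier;

final class BoundedLockWait {
    /**
     * Polls tryObtain until it succeeds, the attempt budget is exhausted,
     * or the deadline has passed, whichever comes first.
     */
    static boolean obtain(BooleanSupplier tryObtain, long timeoutMillis, long pollMillis) {
        // Upper bound on attempts, assuming each attempt itself takes no time.
        long maxAttempts = Math.max(1, timeoutMillis / pollMillis);
        long deadline = System.currentTimeMillis() + timeoutMillis;
        for (long attempt = 0; attempt < maxAttempts; attempt++) {
            if (tryObtain.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                break; // time is up according to the clock
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

The attempt count guards against the clock being set backwards, while the deadline check guards against attempts that turn out to be slow.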

Re: Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Yonik Seeley
On 9/21/06, Steve Harris <[EMAIL PROTECTED]> wrote: Interesting. I wonder, I have a notification mechanism at my disposal as well. I wonder if it could be worked out that, much like a mvc, an IndexReader could be notified when the underlying Directory has changed so that the reader can adjust its

Re: Re: Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
I don't know the list server's rules but I figured I would just include the text of the file I changed. If that is bad form give me a heads up and I won't do it again. Would this change break anything or bother anyone? package org.apache.lucene.index; /** * Copyright 2004 The Apache Software Found

Distributed Indexes, Searches and HDFS

2006-09-21 Thread Chris D
Hi List, As a bit of an experiment I'm redoing some of our indexing and searching code to try to make it easier to manage and distribute. The system has to modify its indexes frequently, sometimes in huge batches, and the documents in the indexes are frequently modified (deleted, modified and re

Re: Re: Re: Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
Oops, I made a change and didn't test it. Doh, This should work better: package org.apache.lucene.index; /** * Copyright 2004 The Apache Software Foundation * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. * Yo

Re: Re: Re: Re: Re: Re: Clustering IndexWriter?

2006-09-21 Thread Steve Harris
So I clustered this app: So I switched to clustering the RAMDirectory instead of the IndexWriter and it worked in my experiments. What I did was create a new IndexWriter on Document Adds and a new IndexSearcher on document queries. What I want to know is: how non-standard is this approach? Chee

Re: Distributed Indexes, Searches and HDFS

2006-09-21 Thread Yonik Seeley
On 9/21/06, Chris D <[EMAIL PROTECTED]> wrote: The cronjob/link solution which is quite clean, doesn't work well in a windows environment. While it's my favorite, no dice... Rats. There may be hope yet for that on Windows. Hard links work on Windows, but the only problem is that you can't renam