Re: [jira] Commented: (LUCENE-1519) Change Primitive Data Types from int to long in class SegmentMerger.java

2009-01-13 Thread robert engels
Fairly certain it belongs in the dev-list because it is a bug... The index length is a long, but the left side will be truncated in int math before it is converted to a long and compared. On Jan 13, 2009, at 11:36 PM, Otis Gospodnetic (JIRA) wrote: [ https://issues.apache.org/jira/brow
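
The truncation described in this message can be illustrated with a minimal sketch. The class, method names, and numbers below are hypothetical and are not taken from SegmentMerger.java; the point is only that an int * int product wraps around before Java widens it to long.

```java
class IntOverflowDemo {
    // Hypothetical helper: the multiplication is done in 32-bit int math,
    // so it can wrap around BEFORE the result is widened to long.
    static long sizeWithIntMath(int docCount, int bytesPerDoc) {
        return docCount * bytesPerDoc;
    }

    // Casting one operand first forces the whole product into 64-bit math.
    static long sizeWithLongMath(int docCount, int bytesPerDoc) {
        return (long) docCount * bytesPerDoc;
    }

    public static void main(String[] args) {
        int docCount = 300_000_000; // illustrative value only
        int bytesPerDoc = 8;        // illustrative value only
        System.out.println("int math:  " + sizeWithIntMath(docCount, bytesPerDoc));
        System.out.println("long math: " + sizeWithLongMath(docCount, bytesPerDoc));
    }
}
```

A comparison of the int-math result against a long file length would fail (or pass by accident) once the product exceeds Integer.MAX_VALUE, which is why the cast has to happen before the arithmetic.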

Re: Filesystem based bitset

2009-01-10 Thread robert engels
possible - different people have different capacities. And if the shoe is on the other foot, it wastes my time, unless the person being talked to has demonstrated a willingness and ability to learn. You can't get blood from a turnip. On Jan 10, 2009, at 11:57 AM, robert engels wrote:

Re: Filesystem based bitset

2009-01-10 Thread robert engels
to either explain everything 3 times, or redo his crap all the time. In fact, I would work hard to see that the latter did not work there very long. On Jan 10, 2009, at 7:53 AM, Grant Ingersoll wrote: On Jan 9, 2009, at 8:06 PM, robert engels wrote: Luckily there are entrepreneurs and othe

Re: [jira] Commented: (LUCENE-1345) Allow Filter as clause to BooleanQuery

2009-01-10 Thread robert engels
There are many expensive to evaluate queries. If you move the deletions check to a clause, then these will be evaluated on deleted documents. For example, we have a query that inspects stored fields, augmented first by an indexed term - basically specialized phrase matching, so a user can

Re: Filesystem based bitset

2009-01-09 Thread robert engels
It was not ad hominem. It was an indirect critique of the value of the answer provided. Ad hominem would be if I called him ugly. On Jan 9, 2009, at 6:34 PM, Doug Cutting wrote: robert engels wrote: Can something be offensive if it's a statement of fact? If you believe it is (under

Re: Filesystem based bitset

2009-01-09 Thread robert engels
who are forcing you to use such a feeble-minded project that your approach would work better for them. I've even created a space for you on Google-Code for you to show them:- http://code.google.com/p/roberts-search/ Sincerely Ian. robert engels wrote: I have better things to do than

Re: Filesystem based bitset

2009-01-09 Thread robert engels
Can something be offensive if it's a statement of fact? If you believe it is (under definition #3), then his remarks to me were just as offensive - as they caused me much displeasure and resentment. So please dress him down as well. Main Entry: 1of·fen·sive Pronunciation: \ə-ˈfen(t)-siv, especial

Re: Filesystem based bitset

2009-01-09 Thread robert engels
3:42:35PM -0600, robert engels wrote: If your index can fit in the IO cache, you should be using a completely different implementation... You should be writing a sequential transaction log for add/update/delete operations, and storing the entire index in memory (RAMDirectory) - with periodic background flu

Re: Filesystem based bitset

2009-01-09 Thread robert engels
If your index can fit in the IO cache, you should be using a completely different implementation... You should be writing a sequential transaction log for add/update/delete operations, and storing the entire index in memory (RAMDirectory) - with periodic background flushes of the log. If you

Re: [jira] Commented: (LUCENE-1482) Replace infoSteram by a logging framework (SLF4J)

2009-01-09 Thread robert engels
Mangar wrote: On Sat, Jan 10, 2009 at 12:41 AM, robert engels wrote: This is not really true these days. Dynamic class instrumentation/byte modification can remove the calls entirely (for loggers not enabled). They can be enabled during startup (or a reload from a different class loader

Re: [jira] Commented: (LUCENE-1482) Replace infoSteram by a logging framework (SLF4J)

2009-01-09 Thread robert engels
. Normally this is done like this: if(Logger.isEnabled(loggername)) { Logger.log(loggername,xxx); } The runtime loader can detect the Logger.isEnabled() byte code, and remove the entire if statement during class loading. On Jan 9, 2009, at 1:11 PM, robert engels wrote: This is not
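
The guard pattern quoted in this message can be sketched as follows. This is only an illustration: java.util.logging stands in for the `Logger.isEnabled(loggername)` call in the message, and the class, logger name, and message text are hypothetical. The sketch shows the guard itself, not the class-load-time removal of the guard that the message describes.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class GuardedLogging {
    // java.util.logging is used as a stand-in for the Logger in the message (assumption).
    static final Logger LOG = Logger.getLogger("demo.merge");

    // Counts how often the (potentially expensive) message is actually built.
    static int messagesBuilt = 0;

    static String expensiveMessage() {
        messagesBuilt++;
        return "merged " + messagesBuilt + " segment(s)";
    }

    // The guard: the message is only built when the logger is enabled.
    static void logMerge() {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(expensiveMessage());
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.OFF);
        logMerge();
        System.out.println("built while disabled: " + messagesBuilt);

        LOG.setLevel(Level.FINE);
        logMerge();
        System.out.println("built while enabled: " + messagesBuilt);
    }
}
```

With the guard in place, a disabled logger costs one level check per call site; the instrumentation approach discussed in the thread would aim to remove even that check.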

Re: [jira] Commented: (LUCENE-1482) Replace infoSteram by a logging framework (SLF4J)

2009-01-09 Thread robert engels
This is not really true these days. Dynamic class instrumentation/byte modification can remove the calls entirely (for loggers not enabled). They can be enabled during startup (or a reload from a different class loader). See the paper at http://www.springerlink.com/content/ur00014m0327542

Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-08 Thread robert engels
The way we've simplified this is that every document has an OID. It simplifies updates and delete tracking (in the transaction log). On Jan 8, 2009, at 2:28 PM, Marvin Humphrey (JIRA) wrote: [ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetab

Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-07 Thread robert engels
gment is the penalty. Or the policy can be pluggable and the "shared" can use the old bitset method. On Jan 8, 2009, at 12:04 AM, Marvin Humphrey wrote: On Wed, Jan 07, 2009 at 10:36:01PM -0600, robert engels wrote: Yes, and I don't think the "worst-case" is cor

Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-07 Thread robert engels
mall case should probably be less than the size of a "standard" disk block (which is probably 32k these days, meaning 256k documents). On Jan 7, 2009, at 10:28 PM, Marvin Humphrey wrote: On Wed, Jan 07, 2009 at 09:28:40PM -0600, robert engels wrote: Why not just write the first byte as

Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2009-01-07 Thread robert engels
Why not just write the first byte as 0 for a bitset, and 1 for a sparse bit set (compressed), and make the determination when writing based on the segment size and/or number of set bits. On Jan 7, 2009, at 8:38 PM, Marvin Humphrey (JIRA) wrote: [ https://issues.apache.org/jira/browse/L
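
One way to picture the leading format byte suggested here is the sketch below. It is not Lucene's BitVector format: the class name, the layout after the tag byte, and the size heuristic are all assumptions, and the "sparse" form is a plain list of 4-byte doc ids rather than a truly compressed encoding.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.BitSet;

class TaggedBitSetCodec {
    static final byte DENSE = 0;  // first byte 0: raw bitset follows
    static final byte SPARSE = 1; // first byte 1: list of set doc ids follows

    // Assumes every set bit is below maxDoc.
    static byte[] write(BitSet bits, int maxDoc) {
        int setBits = bits.cardinality();
        int denseBytes = (maxDoc + 7) / 8;
        // Assumed heuristic: go sparse when 4 bytes per set bit beats 1 bit per document.
        if ((long) setBits * 4 < denseBytes) {
            ByteBuffer buf = ByteBuffer.allocate(1 + 4 + setBits * 4);
            buf.put(SPARSE).putInt(setBits);
            for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1)) {
                buf.putInt(i);
            }
            return buf.array();
        }
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + denseBytes);
        buf.put(DENSE).putInt(maxDoc);
        // toByteArray() drops trailing zero bytes, so pad back to the full width.
        buf.put(Arrays.copyOf(bits.toByteArray(), denseBytes));
        return buf.array();
    }

    static BitSet read(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        byte tag = buf.get();
        if (tag == SPARSE) {
            int count = buf.getInt();
            BitSet bits = new BitSet();
            for (int i = 0; i < count; i++) {
                bits.set(buf.getInt());
            }
            return bits;
        }
        int maxDoc = buf.getInt();
        byte[] raw = new byte[(maxDoc + 7) / 8];
        buf.get(raw);
        return BitSet.valueOf(raw);
    }

    public static void main(String[] args) {
        BitSet few = new BitSet();
        few.set(3);
        few.set(900);
        byte[] encoded = write(few, 1000);
        System.out.println("tag=" + encoded[0] + " bytes=" + encoded.length);
        System.out.println("round trip ok: " + read(encoded).equals(few));
    }
}
```

Because the reader dispatches on the first byte, the writer is free to change its heuristic later without breaking existing files.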

Re: [jira] Commented: (LUCENE-1513) fastss fuzzyquery

2009-01-06 Thread robert engels
nto 1 second queries by removing a linear algorithm, i didn't really optimize much beyond that because i was just very happy to have reasonable performance.. On Tue, Jan 6, 2009 at 6:26 PM, robert engels wrote: I understand now. The index in my case would definitely be MUCH larger, bu

Re: [jira] Commented: (LUCENE-1513) fastss fuzzyquery

2009-01-06 Thread robert engels
6, 2009, at 4:29 PM, Robert Muir wrote: On Tue, Jan 6, 2009 at 5:15 PM, robert engels wrote: It is definitely going to increase the index size, but not any more than the external one would (if my understanding is correct). The nice thing is that you don't have to try and

Re: [jira] Commented: (LUCENE-1513) fastss fuzzyquery

2009-01-06 Thread robert engels
fore nr). For example, searching google for 'robrt engels' works. So does 'obert engels', so does 'robt engels', all ask me if I meant 'robert engels', but searching for 'obrt engels' does not. On Jan 6, 2009, at 4:15 PM, robert engels wrote: I

Re: [jira] Commented: (LUCENE-1513) fastss fuzzyquery

2009-01-06 Thread robert engels
tional information for slop factor for each neighborhood term... i think it's worth investigating, maybe performance would actually be better, just curious. i think i boxed myself in to auxiliary index because of some other irrelevant things i am doing. On Tue, Jan 6, 2009 at 4:42 PM, robert engels

Re: [jira] Commented: (LUCENE-1513) fastss fuzzyquery

2009-01-06 Thread robert engels
I don't think that is the case. You will have a single deletion neighborhood. The number of unique terms in the field is going to be the union of the deletion dictionaries of each source term. For example, given the following: document A, which has field 'X' with value best, and document B wi
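
As a rough illustration of a deletion neighborhood, the sketch below generates the variants of a term obtained by deleting at most one character, and treats two terms as candidate matches when their neighborhoods overlap. The class and method names are hypothetical, the deletion depth of one is an assumption, and this is not the code attached to LUCENE-1513.

```java
import java.util.Set;
import java.util.TreeSet;

class DeletionNeighborhood {
    // The term itself plus every string obtained by deleting exactly one character.
    static Set<String> variants(String term) {
        Set<String> out = new TreeSet<>();
        out.add(term);
        for (int i = 0; i < term.length(); i++) {
            out.add(term.substring(0, i) + term.substring(i + 1));
        }
        return out;
    }

    // Two terms are candidate fuzzy matches if their neighborhoods share a variant.
    static boolean candidateMatch(String a, String b) {
        Set<String> shared = variants(a);
        shared.retainAll(variants(b));
        return !shared.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(variants("best"));
        System.out.println(candidateMatch("best", "beat"));
        System.out.println(candidateMatch("best", "worst"));
    }
}
```

Indexing the variants of every source term into one field gives the union of deletion dictionaries that the message refers to; a real implementation would still verify candidates with an edit-distance check.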

Re: [jira] Commented: (LUCENE-1513) fastss fuzzyquery

2009-01-06 Thread robert engels
Why not just create a new field for this? That is, if you have FieldA, create field FieldAFuzzy and put the various permutations there. The fuzzy scorer/parser can be changed to automatically use the Fuzzy field when required. You could also store positions, and allow that the first ter

Re: Realtime Search

2009-01-05 Thread robert engels
Then your comments are misdirected. On Jan 5, 2009, at 1:19 PM, Doug Cutting wrote: Robert Engels wrote: Do what you like. You obviously will. This is the problem with the Lucene managers - the problems are only the ones they see - same with the solutions. If the solution (or questions

Re: Realtime Search

2008-12-26 Thread Robert Engels
tions. If the solution (or questions) put them outside their comfort zone, they are ignored or dismissed in a tone that is designed to limit any further questions (especially those that might question their ability and/or understanding). -Original Message- >From: Marvin Humphr

Re: Realtime Search

2008-12-26 Thread Robert Engels
There is also the distributed model - but in that case each node is running some sort of server anyway (as in Hadoop). It seems that the distributed model would be easier to develop using Hadoop over the embedded model. -Original Message- >From: Robert Engels >Sent: Dec 26, 200

Re: Realtime Search

2008-12-26 Thread Robert Engels
If you move to the "either embedded, or server model", the post reopen is trivial, as the structures can be created as the segment is written. It is the networked shared access model that causes a lot of these optimizations to be far more complex than needed. Would it maybe be simpler to move t

Re: Realtime Search

2008-12-26 Thread Robert Engels
This is what we mostly do, but we serialize the documents to a log file first, so if server crashes before the background merge of the RAM segments into the disk segments completes, we can replay the operations on server restart. Since the serialize is a sequential write to an already open file,
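
The log-then-replay idea in this message can be sketched as below. Everything here is an assumption made for illustration: the class name, the tab-separated record format, and the in-memory map standing in for the RAM segments. It does not use Lucene's RAMDirectory, it does not fsync, and it assumes ids and documents contain no tabs or newlines.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;

class TransactionLogSketch implements AutoCloseable {
    // Stand-in for the RAM segments: document id -> document body.
    final Map<String, String> ram = new LinkedHashMap<>();
    private BufferedWriter out;

    // Opening the log replays any operations left over from a previous run.
    TransactionLogSketch(Path logFile) {
        try {
            if (Files.exists(logFile)) {
                for (String line : Files.readAllLines(logFile, StandardCharsets.UTF_8)) {
                    apply(line);
                }
            }
            out = Files.newBufferedWriter(logFile, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void add(String id, String doc) { append("ADD", id, doc); }

    void delete(String id) { append("DEL", id, ""); }

    // Sequential append to the already open file first, then the in-memory update.
    private void append(String op, String id, String doc) {
        String record = op + "\t" + id + "\t" + doc;
        try {
            out.write(record);
            out.newLine();
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        apply(record);
    }

    private void apply(String record) {
        String[] parts = record.split("\t", 3);
        if (parts[0].equals("ADD")) {
            ram.put(parts[1], parts[2]);
        } else if (parts[0].equals("DEL")) {
            ram.remove(parts[1]);
        }
    }

    @Override
    public void close() {
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper so callers need not handle checked exceptions.
    static Path tempLog() {
        try {
            return Files.createTempFile("txlog", ".log");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Constructing a second TransactionLogSketch on the same file simulates a restart: the in-memory state is rebuilt purely from the log, which is the recovery path the message describes. A real system would also truncate the log once the background merge to disk segments completes.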

Re: Realtime Search

2008-12-26 Thread Robert Engels
to be significantly smaller (improving the write time, and the cache efficiency). -Original Message- >From: Robert Engels >Sent: Dec 26, 2008 11:30 AM >To: java-dev@lucene.apache.org, java-dev@lucene.apache.org >Subject: Re: Realtime Search > >That could very well be, but

Re: Realtime Search

2008-12-26 Thread Robert Engels
1:31 PM >To: java-dev@lucene.apache.org >Subject: Re: Realtime Search > >On Wed, Dec 24, 2008 at 12:02:24PM -0600, robert engels wrote: >> As I understood this discussion though, it was an attempt to remove >> the in memory 'skip to' index, to avoid the reading of th

Re: Realtime Search

2008-12-24 Thread robert engels
On Dec 24, 2008, at 12:23 PM, Jason Rutherglen wrote: > Also, what are the requirements? Must a document be visible to search within 10ms of being added? 0-5ms. Otherwise it's not realtime, it's batch indexing. The realtime system can support small batches by encoding them into RAMDir

Re: Realtime Search

2008-12-24 Thread robert engels
wrote: Op Wednesday 24 December 2008 17:51:04 schreef robert engels: Thinking about this some more, you could use fixed length pages for the term index, with a page header containing a count of entries, and use key compression (to avoid the constant entry size). The problem with this is that

Re: Realtime Search

2008-12-24 Thread robert engels
ss, it will be a winner ! On Dec 23, 2008, at 11:02 PM, robert engels wrote: Seems doubtful you will be able to do this without increasing the index size dramatically. Since it will need to be stored "unpacked" (in order to have random access), yet the terms are variable length -

Re: Realtime Search

2008-12-23 Thread robert engels
overall than accessing the entire mapped file directly on every invocation. On Dec 23, 2008, at 9:20 PM, Marvin Humphrey wrote: On Tue, Dec 23, 2008 at 08:36:24PM -0600, robert engels wrote: Is there something that I am missing? Yes. I see lots of references to using "memory mapp

Re: Realtime Search

2008-12-23 Thread robert engels
fter an optimize the visited pages will not be in the cache (or in core). On Dec 23, 2008, at 9:20 PM, Marvin Humphrey wrote: On Tue, Dec 23, 2008 at 08:36:24PM -0600, robert engels wrote: Is there something that I am missing? Yes. I see lots of references to using "memory mapped"

Re: Realtime Search

2008-12-23 Thread robert engels
Is there something that I am missing? I see lots of references to using "memory mapped" files to "dramatically" improve performance. I don't think this is the case at all. At the lowest levels, it is somewhat more efficient from a CPU standpoint, but with a decent OS cache the IO performanc

Re: [jira] Created: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2008-12-10 Thread robert engels
If wildcards and fuzzies are supported, why not range ? We have a custom "range in phrase" parser, and it works really well, but we would like to use standard Lucene if possible. On Dec 10, 2008, at 12:18 PM, Mark Harwood (JIRA) wrote: Wildcards, ORs etc inside Phrase queries -

[jira] Commented: (LUCENE-1475) Expose sub-IndexReaders from MultiReader or MultiSegmentReader

2008-12-09 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654924#action_12654924 ] robert engels commented on LUCENE-1475: --- That is not correct. By returning a

Re: [jira] Commented: (LUCENE-1473) Implement standard Serialization across Lucene versions

2008-12-08 Thread robert engels
arder, but yields insane parsing performance. Is there any reason to worry about library-bundled parsers if you're making something more complex than a college project? On Tue, Dec 9, 2008 at 01:49, robert engels <[EMAIL PROTECTED]> wrote: The problem with that is that in most cases you s

Re: [jira] Commented: (LUCENE-1473) Implement standard Serialization across Lucene versions

2008-12-08 Thread robert engels
cher wrote: Well, there's the pretty sophisticated and extensible XML query parser in contrib. I've still only scratched the surface of it, but it meets the specs you mentioned. Erik On Dec 8, 2008, at 4:51 PM, robert engels wrote: I think an important piece to make this

Re: [jira] Commented: (LUCENE-1473) Implement standard Serialization across Lucene versions

2008-12-08 Thread robert engels
I think an important piece to make this work is the query parser/syntax. We already have a system similar to what is outlined below. We made changes to the query syntax to support our various query extensions. The nice thing is that persisting queries is a simple string. It also makes it

[jira] Commented: (LUCENE-1475) Expose sub-IndexReaders from MultiReader or MultiSegmentReader

2008-12-08 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654518#action_12654518 ] robert engels commented on LUCENE-1475: --- I think the API is wrong. The me

Re: [jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2008-12-07 Thread robert engels
One thing to keep in mind about using the field cache for filter caching: the filter bitset cache at worst holds 8 documents per byte (and with bitset compression this can be even more efficient). Using the field cache is instead going to cost bytes per document, most likely at least an orde
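
The memory comparison in this message comes down to simple arithmetic, sketched here. The document count and the 4 bytes per cached value (for example an int per document) are illustrative assumptions, not figures from the thread.

```java
class FilterCacheMemory {
    // An uncompressed bitset needs one bit per document: 8 documents per byte.
    static long bitSetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    // A field cache entry needs a full value per document.
    static long fieldCacheBytes(long maxDoc, int bytesPerValue) {
        return maxDoc * bytesPerValue;
    }

    public static void main(String[] args) {
        long maxDoc = 10_000_000L; // illustrative index size
        int bytesPerValue = 4;     // assumed: one int per document
        long bitset = bitSetBytes(maxDoc);
        long cache = fieldCacheBytes(maxDoc, bytesPerValue);
        System.out.println("bitset bytes:      " + bitset);
        System.out.println("field cache bytes: " + cache);
        System.out.println("ratio:             " + (cache / bitset));
    }
}
```

Under these assumptions a single cached field costs 32 times the memory of one filter bitset, and wider values such as longs or strings widen the gap further.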

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2008-12-05 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653954#action_12653954 ] robert engels commented on LUCENE-1476: --- That's my point, in complex mult

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2008-12-05 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653912#action_12653912 ] robert engels commented on LUCENE-1476: --- but IndexReader.document(n) throw

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2008-12-05 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653793#action_12653793 ] robert engels commented on LUCENE-1476: --- I don't think you can change th

Re: jira attachments ?

2008-12-04 Thread robert engels
r version). It caused me to switch [back] to Firefox. Try Firefox? Mike robert engels wrote: I am using Safari 3.2 (on OSX Tiger). On Dec 4, 2008, at 5:38 PM, Michael McCandless wrote: Robert which browser are you using? Mike robert engels wrote: Dear God, I've been blocked !

Re: [jira] Commented: (LUCENE-855) MemoryCachedRangeFilter to boost performance of Range queries

2008-12-04 Thread robert engels
On Dec 4, 2008, at 4:10 PM, Paul Elschot wrote: Op Thursday 04 December 2008 23:03:40 schreef robert engels: The biggest benefit I see of using the field cache to do filter caching, is that the same cache can be used for sorting - thereby improving the performance and memory usage. Would it

Re: jira attachments ?

2008-12-04 Thread robert engels
I am using Safari 3.2 (on OSX Tiger). On Dec 4, 2008, at 5:38 PM, Michael McCandless wrote: Robert which browser are you using? Mike robert engels wrote: Dear God, I've been blocked ! What will the Lucene community do ! :) On Dec 4, 2008, at 3:27 PM, Uwe Schindler wrote: Hi R

Re: [jira] Commented: (LUCENE-855) MemoryCachedRangeFilter to boost performance of Range queries

2008-12-04 Thread robert engels
more memory, as every field used needs to be cached. With my code you would only have a single "bitset" for the filter. On Dec 4, 2008, at 4:00 PM, robert engels wrote: Lucene-831 is far more comprehensive. I also think that by exposing access to the sub-readers it can be f

Re: [jira] Commented: (LUCENE-855) MemoryCachedRangeFilter to boost performance of Range queries

2008-12-04 Thread robert engels
PROTECTED] ____ From: robert engels [mailto:[EMAIL PROTECTED] Sent: Thursday, December 04, 2008 9:39 PM To: java-dev@lucene.apache.org Subject: Re: [jira] Commented: (LUCENE-855) MemoryCachedRangeFilter to boost performance of Range queries I can't seem to pos

Re: jira attachments ?

2008-12-04 Thread robert engels
Dear God, I've been blocked ! What will the Lucene community do ! :) On Dec 4, 2008, at 3:27 PM, Uwe Schindler wrote: Hi Robert, two minutes ago I uploaded a patch... Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: [EMAIL PROTECTED] From: r

Re: [jira] Commented: (LUCENE-855) MemoryCachedRangeFilter to boost performance of Range queries

2008-12-04 Thread robert engels
I can't seem to post to Jira, so I am attaching here... I attached QueryFilter.java. In reading this patch, and other similar ones, the problem seems to be that if the index is modified, the cache is invalidated, causing a complete reload of the cache. Do I have this correct? The attached patch works

jira attachments ?

2008-12-04 Thread robert engels
I am having a problem posting an attachment to Jira. Just spins, and spins... Everything else seems to work fine (comments, etc.). Anyone else experiencing this? Thanks. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additi

[jira] Commented: (LUCENE-1473) Implement standard Serialization across Lucene versions

2008-12-04 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653421#action_12653421 ] robert engels commented on LUCENE-1473: --- Even if you changed SUIDs based on ver

Re: [jira] Commented: (LUCENE-1473) Implement Externalizable in main top level searcher classes

2008-12-03 Thread robert engels
wrong way, that is their fault, not Lucene's Making something protected is very different than making it public. Robert Engels On Dec 3, 2008, at 11:36 PM, John Wang wrote: Grant: I am sorry that I disagree with some points: 1) "I think it's a sign that Luc

[jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

2008-12-03 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653076#action_12653076 ] robert engels commented on LUCENE-1476: --- BitSet is already random access, DocI

[jira] Commented: (LUCENE-1473) Implement standard Serialization across Lucene versions

2008-12-03 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653058#action_12653058 ] robert engels commented on LUCENE-1473: --- Even better. Thanks Mark. > Im

[jira] Commented: (LUCENE-1473) Implement Externalizable in main top level searcher classes

2008-12-03 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652962#action_12652962 ] robert engels commented on LUCENE-1473: --- The reason the XML is not needed

[jira] Commented: (LUCENE-1473) Implement Externalizable in main top level searcher classes

2008-12-03 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652940#action_12652940 ] robert engels commented on LUCENE-1473: --- Jason, you are only partially cor

[jira] Commented: (LUCENE-1473) Implement Externalizable in main top level searcher classes

2008-12-03 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652892#action_12652892 ] robert engels commented on LUCENE-1473: --- In regards to Doug's comment

[jira] Commented: (LUCENE-1473) Implement Externalizable in main top level searcher classes

2008-12-03 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652843#action_12652843 ] robert engels commented on LUCENE-1473: --- I don't see why you can

[jira] Commented: (LUCENE-1472) DateTools.stringToDate() can cause lock contention under load

2008-12-01 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652221#action_12652221 ] robert engels commented on LUCENE-1472: --- The last comment was tested using Ja

[jira] Commented: (LUCENE-1472) DateTools.stringToDate() can cause lock contention under load

2008-12-01 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652220#action_12652220 ] robert engels commented on LUCENE-1472: --- If you review the source

Re: Option to fsync files

2008-11-19 Thread robert engels
It is not just databases by the way, any journaling file system would be pointless... On Nov 19, 2008, at 12:55 PM, robert engels wrote: The "utility" referenced no longer exists... and it's no wonder. It is most likely that the tester did not have the drives configured prop

Re: Option to fsync files

2008-11-19 Thread robert engels
I don't believe it - unless the older drives had no cache, in which case it wouldn't matter. It is also doubtful at the OS level, as system integrity would be hopelessly compromised... On Nov 19, 2008, at 12:11 PM, Mark Miller wrote: robert engels wrote: There is an option on

Re: Option to fsync files

2008-11-19 Thread robert engels
The "utility" referenced no longer exists... and it's no wonder. It is most likely that the tester did not have the drives configured properly. In almost all cases, if the drive did this, you could not run a database system with any resiliency. They would also have problems with shutdown -

Re: Option to fsync files

2008-11-19 Thread robert engels
lies to you, there is more testing of it. I have also seen bits or pieces about it elsewhere. I choose to believe myself, but I will admit I was 100% wrong about Santa Claus, so take it for what it's worth. I haven't tested it at all. robert engels wrote: I would really like to se

Re: Option to fsync files

2008-11-19 Thread robert engels
I would really like to see some PROOF of these drives "lying". If that were the case, no database system would ever be reliable on these drives ! And data corruption would be happening all over the place ! On Nov 19, 2008, at 10:56 AM, Mark Miller wrote: Michael McCandless wrote: Mark

Re: Allow IndexReader to take ownership of Directory

2008-11-18 Thread robert engels
Why not create new lightweight references to the directory, and use WeakReferences and ReferenceQueues to avoid the need to manually use incRef and decRef ? Tracking state like this almost always leads to problems - this is why Java has GC in the first place - because it is very diff

Re: RAM memory problems dealing with documents

2008-10-25 Thread robert engels
Try running the application with -Xmx128m. You may find that it works fine. Java will continue to allocate memory if it needs it - if you allow it to. It does this for performance reasons. If the app runs in a tight loop, this could be your problem. If that is not the case, it is most

[jira] Commented: (LUCENE-1383) Work around ThreadLocal's "leak"

2008-10-01 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636095#action_12636095 ] robert engels commented on LUCENE-1383: --- You cannot control this 'e

[jira] Commented: (LUCENE-1383) Work around ThreadLocal's "leak"

2008-10-01 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636075#action_12636075 ] robert engels commented on LUCENE-1383: --- It doesn't need to be "fix

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-11 Thread robert engels
as no sync. Mike robert engels wrote: You still need to sync access to the list, and how would it be removed from the list prior to close? That is you need one per thread, but you can have the reader shared across all threads. So if threads were created and destroyed without ever c

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-11 Thread robert engels
new thread. Retrieving an existing thread has no sync. Mike robert engels wrote: You still need to sync access to the list, and how would it be removed from the list prior to close? That is you need one per thread, but you can have the reader shared across all threads. So if threads were c

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-11 Thread robert engels
o keep it alive. That's its only purpose. Then, when SegmentReader is closed this list is cleared and GC is free to reclaim all SegmentTermEnums. Mike robert engels wrote: But you need it by thread, so it can't be a list. You could have a HashMap of in FieldsReader, an

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-11 Thread robert engels
bject out from under the ThreadLocal even before the ThreadLocal purges its stale entries. Mike robert engels wrote: You can't hold the ThreadLocal value in a WeakReference, because there is no hard reference between enumeration calls (so it would be cleared out from under you while en

Re: [jira] Updated: (LUCENE-1381) Hanging while indexing/digesting on multiple threads

2008-09-11 Thread robert engels
By the stacktraces, I think there may be a bug in MethodUtils. By its name it would appear to be static, with a "weak hash map" of names to methods, but it appears that multiple threads are accessing the same map without synchronization. This may be wreaking havoc with the WeakReference

[jira] Commented: (LUCENE-1195) Performance improvement for TermInfosReader

2008-09-10 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12630091#action_12630091 ] robert engels commented on LUCENE-1195: --- Also, SafeThreadLocal can be trivi

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-10 Thread robert engels
You can't hold the ThreadLocal value in a WeakReference, because there is no hard reference between enumeration calls (so it would be cleared out from under you while enumerating). All of this occurs because you have some objects (readers/segments etc.) that are shared across all threads, b

[jira] Updated: (LUCENE-1195) Performance improvement for TermInfosReader

2008-09-10 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] robert engels updated LUCENE-1195: -- Attachment: SafeThreadLocal.java A "safe" ThreadLocal that can be used for more det

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-10 Thread robert engels
x.php?title=Create_Lucene_Database_Search_in_3_minutes DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding! On Wed, Sep 10, 2008 at 10:39 AM, robert engels <[EMAIL PROTECTED]> wrote: Close() does work - it is just that the memory may not be freed until much later... When

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-10 Thread robert engels
er, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding! On Wed, Sep 10, 2008 at 10:39 AM, robert engels <[EMAIL PROTECTED]> wrote: Close() does work - it is just that the memory may not be freed until much later... When working with VERY LARGE objects, th

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-10 Thread robert engels
(anonymous per request) got 2.6 Million Euro funding! On Wed, Sep 10, 2008 at 9:10 AM, robert engels <[EMAIL PROTECTED]> wrote: You do not need to create a new RAMDirectory - just write to the existing one, and then reopen() the IndexReader using it. This will prevent lots of big object

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-10 Thread robert engels
't always clear. But it will be far more deterministic. If someone is interested I can post the class, but I think it is well within the understanding of the core Lucene developers. On Sep 10, 2008, at 11:10 AM, robert engels wrote: You do not need to create a new RAMDirectory - just wri

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-10 Thread robert engels
=Create_Lucene_Database_Search_in_3_minutes DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding! On Tue, Sep 9, 2008 at 10:43 PM, robert engels <[EMAIL PROTECTED]> wrote: You do not need a pool of IndexReaders... It does not matter what class

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-10 Thread robert engels
s using ThreadLocal to cache lots of things by the Thread as the key, and no idea when it'll be released. Of course ThreadLocal is not Lucene's problem... Chris On Wed, Sep 10, 2008 at 8:34 AM, robert engels <[EMAIL PROTECTED]> wrote: It is basic Java. Threads are not g

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-10 Thread robert engels
10:34 AM, robert engels wrote: It is basic Java. Threads are not guaranteed to run on any sort of schedule. If you create lots of large objects in one thread, releasing them in another, there is a good chance you will get an OOM (since the releasing thread may not run before the OOM occurs

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-10 Thread robert engels
.php?title=Create_Lucene_Database_Search_in_3_minutes DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding! On Wed, Sep 10, 2008 at 7:12 AM, robert engels <[EMAIL PROTECTED]> wrote: Sorry, but I am fairly certain you are mistaken. If you only have a single IndexReader, the RAMD

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-10 Thread robert engels
Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding! On Tue, Sep 9, 2008 at 10:43 PM, robert engels <[EMAIL PROTECTED]>

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-09 Thread robert engels
As a follow-up, the SegmentTermEnum does contain an IndexInput and, based on your configuration (buffer sizes, e.g.), this could be a large object, so you do need to be careful! On Sep 10, 2008, at 12:14 AM, robert engels wrote: A searcher uses an IndexReader - the IndexReader is slow to open

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-09 Thread robert engels
http://search.dbsight.com Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding! On Tue, Sep 9, 2008 at 10:14 PM, robert engels <

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-09 Thread robert engels
Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding! On Tue, Sep 9, 2008 at 9:03 PM, robert engels <[EMAIL PROTECTED]> wrote: You

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-09 Thread robert engels
www.dbsight.net demo: http://search.dbsight.com Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding! On Tue, Sep 9, 2008 at

Re: ThreadLocal causing memory leak with J2EE applications

2008-09-09 Thread robert engels
Your code is not correct. You cannot release it on another thread - the first thread may be creating hundreds/thousands of instances before the other thread ever runs... On Sep 9, 2008, at 10:10 PM, Chris Lu wrote: If I release it on the thread that's creating the searcher, by setting searche

[jira] Commented: (LUCENE-753) Use NIO positional read to avoid synchronization in FSIndexInput

2008-08-31 Thread robert engels (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627313#action_12627313 ] robert engels commented on LUCENE-753: -- SUN is accepting outside bug fixes to

Re: performance optimizations

2008-07-23 Thread robert engels
The bug/issue I was referring to is the pread/multiple file descriptors. This is a clear issue in the JVM, and has been for a long time. Count the hours spent discussing/devising/debugging/implementing this issue, instead of just having it fixed in the JVM. Not worth the work IMO (and

Re: performance optimizations

2008-07-23 Thread robert engels
t for core library like Lucene where performance is one of the primary features. -Yonik On Wed, Jul 23, 2008 at 4:01 PM, robert engels <[EMAIL PROTECTED]> wrote: I hope this doesn't offend anyone, but I think this is an excellent article that the Lucene development team might fin

performance optimizations

2008-07-23 Thread robert engels
I hope this doesn't offend anyone, but I think this is an excellent article that the Lucene development team might find helpful. I have often been dismayed at complex code being written to achieve "negligible" performance improvements. Most often, a micro benchmark is used to justify the ch

Re: ThreadLocal in SegmentReader

2008-07-14 Thread robert engels
webapp itself and may redeploy other applications). Here a bunch of threads deploy webapp, and they are all different. 2. What if a user just wants to undeploy the webapp, without the redeploy? He expects the memory to be released, but it will not be. Robert Engels wrote: If you are attempting to

Re: ThreadLocal in SegmentReader

2008-07-14 Thread robert engels
guaranteed to be removed only when the table starts running out of space." How do you suggest to force the removal of stale entries to the Lucene user? Robert Engels wrote: You are mistaken - Yonik's comment in that thread is correct (although it is not just at table resize - any t
