The fix should be in the implementation / port of WeakHashMap, and not in Lucene.Net base code.
I believe Luc is working on providing an alternative patch / fix. -- George > -----Original Message----- > From: Timothy Januario [mailto:timothy.janua...@bdmetrics.com] > Sent: Monday, January 05, 2009 10:19 AM > To: lucene-net-dev@incubator.apache.org > Subject: RE: Problems with patch for LUCENENET-106 > > I stand corrected. I do see the possibility of the collision > as a potentially serious bug. The java version still uses > the object as the key which means that if the hashcode is > identical across all instances, it just becomes less > efficient. In the current .NET implementation, it stays very > efficient, it would just only ever have a single instance in > the cache which is a very different and incorrect result. > > -----Original Message----- > From: Eyal Post [mailto:eyalp...@epocalipse.com] > Sent: Saturday, January 03, 2009 1:56 PM > To: lucene-net-dev@incubator.apache.org > Subject: RE: Problems with patch for LUCENENET-106 > > I tend to agree with Luc. > Hashtables don't rely on small chance for collisions to > operate correctly. > They only rely on it to operate efficiently. > This means you can store 2 elements with the same hashcode in > a hashtable and they won't override each other, but the > hashtable lookups will be less efficient with collisions. > The problem with code provided in the patch is that it uses > the hashcode as the key in the hashtable and altough the > chance of collisions is small once they happen the results > are undefined. > I think the best way to solve that is to use WeakEntry as the > key and override both GetHashCode and Equals. GetHashCode > should return the HashCode field and Equals should compare the Keys > > > Eyal Post > > > > -----Original Message----- > > From: George Aroush [mailto:geo...@aroush.net] > > Sent: Friday, January 02, 2009 0:52 AM > > To: lucene-net-dev@incubator.apache.org > > Subject: RE: Problems with patch for LUCENENET-106 > > > > Hi Luc, > > > > I have to agree with Tim. > > > > The .NET's implementation / port of WeakHashMap in Lucene.Net is > > consistent with Java Lucene. If there is a "very" small > chance of a > > hash key collusion, it is also possible in the Java version > of Lucene. > > Here is why. > > > > In Java, when WeakHashMap.put(Object key, Object value) is called, > > this function calls key.hashCode() to get a hash value (I > had a look > > at Java's source code for WeakHashMap.) The same logic is > happening in > > .NET's implementation / port of WeakHashMap. In addition, at least > > for now, the "key" is always of type IndexReader. Looking in > > IndexReader code, there is no override of Java::hashCode() / > > C#::GetHashCode(). > > So, unless if Java's > > hashCode() guaranties uniqueness (which it doesn't), then the > > possibility of a hash value collusion also exists in the > Java version > > of Lucene. If this is the case, then we have a valid > defect in Lucene > > and it should be addressed. > > > > Erik, do you agree, can you comment? > > > > Regarding #2, & #3, I agree that the .NET implementation of > > WeakHashMap could have been confined to just get() / put(). > > Since you already have a patch for Lucene.Net 2.1, please > submit a new > > JIRA issue and attach your patch to it (when you are > ready.) I look > > forward to your patch. > > > > Regards, and Happy New Year everyone! > > > > -- George > > > > > > > -----Original Message----- > > > From: Timothy Januario [mailto:timothy.janua...@bdmetrics.com] > > > Sent: Wednesday, December 31, 2008 9:44 AM > > > To: lucene-net-dev@incubator.apache.org > > > Subject: RE: Problems with patch for LUCENENET-106 > > > > > > Luc, > > > I'm not sure that I completely agree with your assessment of the > > > WeakHashMap. > > > > > > It does assume that the hash value is unique which is a > > correct usage. > > > It is the responsibility of > > > object.GetHashCode() to return a unique value. This is a > > problem in > > > any Hashtable implementation and although I haven't seen > > the internal > > > java code, I assume that it has the same limitation. The > > object would > > > be the key only superficially as the hashcode would be used > > internally > > > as well (based on the name of the object, WeakHashMap, anyway). > > > If this is incorrect, I apologize for the noise. I did, however, > > > notice that GetHashCode() is not overridden in the > IndexReader and > > > since the .NET framework does not guarantee a unique value > > (according > > > to the documentation for oject.GetHashCode()), this could > > present the > > > problem that you have identified. Is there a need to guarantee > > > uniqueness by overriding the method or does anyone know if > > the value > > > is always unique already? When the Cache object was > > implemented with > > > Hashtable prior to this patch, the same problem would have been > > > inherent with that implementation (of course with the > > additional bonus > > > of the memory leak ;)) and I don't think I ever saw collisions > > > although quite honestly I was never really looking for them. > > > > > > Also, you said that the call to WeakHashMap.Keys could > return NULL > > > values. This may be the reason why the Java implementation > > supports > > > null keys. It does seem to be a possibility unless the > > call to Keys > > > checks that WeakReference.Key.Target is not null at the time of > > > iteration (this would eliminate this possibility since we > would now > > > have a strong reference to the object ie: object key = > > > entries[i].Key.Target; if (object != null) > keys.Add(object);). The > > > same problem would exist with the Values property. > > > > > > Also, optimizing for few IndexReaders in the WeakHashMap is an > > > assumption that specializes the class for IndexReaders. > > > There is no guarantee that this will be the only use case in the > > > future. The Java documentation says that the same initial > > parameters > > > (loadfactor and capacity) as HashMap are used which > > suggests that it > > > should be left in the .NET implementation with the same > > parameters as > > > the default Hashtable which is its underlying container. > > > > > > Comments? > > > -tim > > > > > > -----Original Message----- > > > From: Vanlerberghe, Luc [mailto:luc.vanlerber...@bvdep.com] > > > Sent: Wednesday, December 31, 2008 4:48 AM > > > To: lucene-net-dev@incubator.apache.org > > > Subject: Problems with patch for LUCENENET-106 > > > > > > Hi, > > > > > > I reviewed the fix for LUCENENET-106 and I see a couple of issues: > > > > > > 1. The implementation of WeakHashMap in SupportClass.cs has > > a bug in > > > that it assumes all keys will have a unique HashCode. > > > Although the chances of this occurring are *very* small, it > > cannot be > > > guaranteed. A collision would mean that the cached value for one > > > indexReader could be returned as cached value for another > > indexReader > > > (using the way this class is used in Lucene as example) > > > > > > 2. This class attempts to be a complete port of the > > > java.util.WeakHashMap, but: > > > - A correct implementation is difficult to implement, and > .Net does > > > not have an equivalent of java.lang.ref.ReferenceQueue to > > accelerate > > > the cleanup process. > > > - Since the behaviour of the WeakHashMap is closely tied to the > > > behaviour of the garbage collector, it is very difficult to test > > > properly, if at all. > > > - Lucene only uses the get(Object key) and put(K key, V > > value) methods > > > - In Lucene, the keys are IndexReader instances. In normal > > use cases > > > there will be only 1 IndexReader live, and perhaps another one if > > > 'warming up' is used before switching IndexSearchers > after an index > > > update. > > > - If this class is presented as a complete port of > > > java.util.WeakHashMap, users might assume this is > > production quality > > > code and copy/paste its code in their own projects. > > > Any further bugs found might harm the credibility of the Lucene > > > project, even if those sections are never used in Lucene. > > (e.g.: The > > > collection returned by Keys might contain null values. > There's no > > > guarantee the GC won't intervene between the call to > > Cleanup and the > > > calls to keys.Add(w.Key.Target). > > > > > > 3. Most support classes that are needed for the conversion > > of java to > > > .Net are in the anonymous namespace, but this one is in > > > Lucene.Net.Util. I would propose to keep the namespaces > > corresponding > > > to the original java packages clean and either put all > > support classes > > > in the anonymous namespace or in a Lucene.Net specific one > > > (Lucene.Net.DotNetPort ?) (I see that George moved it already to > > > SupportClass, thanks George!) > > > > > > I have been using the java version of Lucene in a project > > for a long > > > time. > > > For a new project that is currently under development, we will be > > > using the .Net version. > > > For now we are using the 2.1 version using a custom patch > > to avoid the > > > memory leak problem. > > > > > > I'll post an up to date version of this patch as a > > replacement for the > > > current implementation of Lucene.Net.Util.WeakHashMap > sometime next > > > week... > > > - It doesn't try to be a complete implementation of > > > WeakHashMap: only the methods strictly necessary are > > implemented, all > > > other methods throw a NotImplementedException. (It should > > probably be > > > renamed to SimplifiedWeakHashMap or something) > > > - It's optimized for the 'low number of keys' case. > > > - It's simple code, but I want to test it thoroughly first. > > > The original patch was put directly in the FieldCacheImpl > > code, I want > > > to make sure I didn't make some stupid mistake while > converting it. > > > > > > I have an account on Jira, but I don't have rights to reopen the > > > issue... > > > > > > Regards, > > > > > > Luc > > > > > > > > > -----Original Message----- > > > From: Digy (JIRA) [mailto:j...@apache.org] > > > Sent: zaterdag 27 december 2008 19:47 > > > To: lucene-net-dev@incubator.apache.org > > > Subject: [jira] Closed: () Lucene.NET (Revision: 603121) > is leaking > > > memory > > > > > > > > > [ > > > https://issues.apache.org/jira/browse/LUCENENET-106?page=com.a > > tlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] > > > > > > Digy closed LUCENENET-106. > > > -------------------------- > > > > > > Resolution: Fixed > > > Assignee: Digy > > > > > > Patches are applied. > > > > > > DIGY. > > > > > > > Lucene.NET (Revision: 603121) is leaking memory > > > > ----------------------------------------------- > > > > > > > > Key: LUCENENET-106 > > > > URL: > > > https://issues.apache.org/jira/browse/LUCENENET-106 > > > > Project: Lucene.Net > > > > Issue Type: Bug > > > > Environment: .NET 2.0 > > > > Reporter: Anton K. > > > > Assignee: Digy > > > > Priority: Critical > > > > Attachments: DIGY-FieldCacheImpl.patch, Digy.rar, > > > luceneSrc_memUsage.patch, Paches for v2.3.1.rar, > > > WeakHashTable+FieldCacheImpl.rar, WeakReferences.rar > > > > > > > > > > > > readerCache Hashtable field (see FieldCacheImpl.cs) never > > > releases some hash items that have closed IndexReader > > object as a key. > > > So a lot of Term instances are never released. > > > > Java version of Lucene uses WeakHashMap and therefore > > > doesn't have this problem. > > > > This bug can be reproduced only when Sort functionality > > > used during search. > > > > See following link for additional information. > > > > http://www.gossamer-threads.com/lists/lucene/java-user/55681 > > > > Steps to reproduce: > > > > 1)Create index > > > > 2) Modify index by IndexWiter; Close IndexWriter > > > > 3) Use IndexSearcher for searching with Sort; Close InexSearcher > > > > 4) Go to step 2 > > > > You'll get OutOfMemoryException after some time of running > > > this algorithm. > > > > > > -- > > > This message is automatically generated by JIRA. > > > - > > > You can reply to this email to add a comment to the issue online. > > > > > > > > > > > > > >