[lucenenet] 02/02: PERFORMANCE: Lucene.Net.Search.Suggest.SortedInputEnumerator: Removed unnecessary call to Reverse() and allocation of HashSet<T> that had been added due to the fact our testing methodology didn't respect set equality (that is, it relied on the order of the collection).

nightowl888 Sun, 14 Mar 2021 05:55:01 -0700

This is an automated email from the ASF dual-hosted git repository.

nightowl888 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/lucenenet.git


commit ec65e26c1c89c80b7194860bf5d1b7eafbee757e
Author: Shad Storhaug <[email protected]>
AuthorDate: Sun Mar 14 16:59:25 2021 +0700

    PERFORMANCE: Lucene.Net.Search.Suggest.SortedInputEnumerator: Removed unnecessary call to Reverse() and allocation of HashSet<T> that had been added due to the fact our testing methodology didn't respect set equality (that is, it relied on the order of the collection).
---
 src/Lucene.Net.Suggest/Suggest/SortedInputIterator.cs      | 14 ++++----------
 .../Suggest/DocumentValueSourceDictionaryTest.cs           |  6 ++++++
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/Lucene.Net.Suggest/Suggest/SortedInputIterator.cs b/src/Lucene.Net.Suggest/Suggest/SortedInputIterator.cs
index 5154192..45be895 100644
--- a/src/Lucene.Net.Suggest/Suggest/SortedInputIterator.cs
+++ b/src/Lucene.Net.Suggest/Suggest/SortedInputIterator.cs
@@ -2,10 +2,8 @@
 using Lucene.Net.Support;
 using Lucene.Net.Support.IO;
 using Lucene.Net.Util;
-using System;
 using System.Collections.Generic;
 using System.IO;
-using System.Linq;
 using JCG = J2N.Collections.Generic;
 
 namespace Lucene.Net.Search.Suggest
@@ -297,7 +295,6 @@ namespace Lucene.Net.Search.Suggest
             tmpInput.SkipBytes(scratch.Length - 2); //skip to context set size
             ushort ctxSetSize = (ushort)tmpInput.ReadInt16();
             scratch.Length -= 2;
-
             var contextSet = new JCG.HashSet<BytesRef>();
             for (ushort i = 0; i < ctxSetSize; i++)
             {
@@ -311,13 +308,10 @@ namespace Lucene.Net.Search.Suggest
                 contextSet.Add(contextSpare);
                 scratch.Length -= curContextLength;
             }
-
-            // LUCENENET TODO: We are writing the data forward.
-            // Not sure exactly why, but when we read it back it
-            // is reversed. So, we need to fix that before returning the result.
-            // If the underlying problem is found and fixed, then this line can just be
-            // return contextSet;
-            return new JCG.HashSet<BytesRef>(contextSet.Reverse());
+            // LUCENENET NOTE: The result was at one point reversed because of test failures, but since we are
+            // using JCG.HashSet<T> now (whose Equals() implementation respects set equality),
+            // we have reverted back to the original implementation.
+            return contextSet;
         }
 
         /// <summary>
diff --git a/src/Lucene.Net.Tests.Suggest/Suggest/DocumentValueSourceDictionaryTest.cs b/src/Lucene.Net.Tests.Suggest/Suggest/DocumentValueSourceDictionaryTest.cs
index c96ec7b..bb8fc9f 100644
--- a/src/Lucene.Net.Tests.Suggest/Suggest/DocumentValueSourceDictionaryTest.cs
+++ b/src/Lucene.Net.Tests.Suggest/Suggest/DocumentValueSourceDictionaryTest.cs
@@ -161,6 +161,12 @@ namespace Lucene.Net.Search.Suggest
                 assertTrue(inputIterator.Current.equals(new BytesRef(doc.Get(FIELD_NAME))));
                 assertEquals(inputIterator.Weight, (w1 + w2 + w3));
                 assertTrue(inputIterator.Payload.equals(doc.GetField(PAYLOAD_FIELD_NAME).GetBinaryValue()));
+
+                // LUCENENET NOTE: This test was once failing because we used SCG.HashSet<T> whose
+                // Equals() implementation does not check for set equality. As a result SortedInputEnumerator
+                // had been modified to reverse the results to get the test to pass. However, using JCG.HashSet<T>
+                // ensures that set equality (that is equality that doesn't care about order of items) is respected.
+                // SortedInputEnumerator has also had the specific sorting removed.
                 ISet<BytesRef> originalCtxs = new JCG.HashSet<BytesRef>();
                 foreach (IIndexableField ctxf in doc.GetFields(CONTEXTS_FIELD_NAME))
                 {

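The distinction the commit message relies on (order-dependent comparison versus set equality) can be illustrated with a minimal sketch. It uses only the .NET base class library, not the JCG.HashSet<T> type from the diff, and the class name and sample strings are hypothetical. System.Collections.Generic.HashSet<T> does not override Equals(), so two sets with the same members compare as unequal unless SetEquals() is used. The diff's comments state that JCG.HashSet<T> builds that content comparison into Equals().

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not part of the Lucene.NET code base.
public static class SetEqualitySketch
{
    public static void Main()
    {
        // Same members, added in opposite order.
        var forward = new HashSet<string> { "ctx1", "ctx2", "ctx3" };
        var reversed = new HashSet<string> { "ctx3", "ctx2", "ctx1" };

        // HashSet<T> inherits Object.Equals(), which is reference equality.
        Console.WriteLine(forward.Equals(reversed));    // False

        // SetEquals() compares membership and ignores insertion order.
        Console.WriteLine(forward.SetEquals(reversed)); // True
    }
}
```

Under this reading, a test that compares two context sets by membership passes whatever order the contexts are read back in, which is why the extra Reverse() call and second HashSet allocation could be dropped.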