RE: Possible Issue with DoubleField
Hi Shad,

thanks for getting back to me.

Regarding the casting: Using your extension method means I no longer have to cast IIndexableField to Field. Nevertheless, I still need to cast FieldType in order to access the NumericType property, probably for the same reason: FieldType is declared as IIndexableFieldType. Here's an example of what I mean.

This works:

    var field = doc.GetField(fieldName);
    // need to do this cast in order to access the NumericType property
    var fieldType = (FieldType)field.FieldType;
    Console.WriteLine("Numeric: {0}", fieldType.NumericType);

This fails:

    var field = doc.GetField(fieldName);
    // trying to access Field.FieldType.NumericType directly fails
    Console.WriteLine("Numeric: {0}", field.FieldType.NumericType);
    // -> "IIndexableFieldType does not contain a definition for NumericType"

It's not a major issue and easy to get around, but it does feel strange from a usage perspective. One could probably solve this with a similar extension method for retrieving the FieldType (which does the casting).

Now regarding the Double/Single issue: Sorry for being unclear. I had added the unit test after upgrading. In the meantime I reverted the sample code back to v4.8.0.770, and it shows the same behavior:

a) The StringValue does not properly represent the NumericValue.
b) The FieldType properties on searching are not the same as during indexing.

The latter point applies to any type of field. Am I missing something here? Shouldn't the FieldType properties represent the values as they were set during indexing?

The sample code illustrates both issues.

Kind regards
Alexander

-----Original Message-----
From: Shad Storhaug [mailto:s...@shadstorhaug.com]
Sent: Friday, 11 August 2017 06:58
To: Roethinger, Alexander <aroethin...@affili.net>
Cc: dev@lucenenet.apache.org
Subject: RE: Possible Issue with DoubleField

[...]
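The idea of an extension method that does the FieldType cast could look roughly like the sketch below. To keep it compilable without Lucene.NET, it uses small stand-in types that only mirror the names from this thread; StubField, IndexableFieldExtensions and GetFieldType<T> are hypothetical names chosen for illustration and are not part of the library.

```csharp
using System;

// Stand-in types that only mirror names used in this thread.
// They are NOT the Lucene.NET definitions.
public enum NumericType { NONE, INT32, INT64, SINGLE, DOUBLE }

public interface IIndexableFieldType
{
    bool IsStored { get; }
}

public class FieldType : IIndexableFieldType
{
    public bool IsStored { get; set; }
    public NumericType NumericType { get; set; }
}

public interface IIndexableField
{
    IIndexableFieldType FieldType { get; }
}

// Hypothetical minimal field used only to exercise the helper.
public class StubField : IIndexableField
{
    public IIndexableFieldType FieldType { get; set; }
}

public static class IndexableFieldExtensions
{
    // Hypothetical helper: keeps the cast in one place and lets the caller
    // name the concrete type it expects. Throws InvalidCastException if the
    // runtime type of the field type is something else.
    public static T GetFieldType<T>(this IIndexableField field)
        where T : IIndexableFieldType
    {
        return (T)field.FieldType;
    }
}

public static class Program
{
    public static void Main()
    {
        IIndexableField field = new StubField
        {
            FieldType = new FieldType { IsStored = true, NumericType = NumericType.DOUBLE }
        };

        // No explicit cast at the call site any more.
        NumericType numericType = field.GetFieldType<FieldType>().NumericType;
        Console.WriteLine("Numeric: {0}", numericType);
    }
}
```

The generic parameter keeps the cast visible to the caller, so a field type that is not a FieldType still fails loudly rather than silently.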
RE: Possible Issue with DoubleField
Hi Alexander,

Thanks for the report.

To answer your second question, Lucene's design has changed to accommodate a "LazyField" (in the Lucene.Net.Misc package) that is read-only. The return type of Document.GetField(string) is IIndexableField, not Field, and therefore typically requires a cast in Java in order to use. To make this easier to work with in .NET, I have added a generic extension method overload so the cast is more obvious.

    var field = doc.GetField<Field>(fieldName);

I am trying to avoid something like public Field GetFieldAsField(), which would throw an exception in the rare case that the underlying IIndexableField could not be cast to Field, and which would not make it obvious that you have to call GetField(string) and cast to an alternative type instead.

You can try this approach out by copying the extension method into your project: https://github.com/apache/lucenenet/blob/master/src/Lucene.Net/Support/Document/DocumentExtensions.cs

For the Single/Double field issue, thanks for putting together the code sample. It is a bit unclear from your question though: was this something that worked in 4.8.0.770-beta and quit working in 4.8.0-beta4?

Thanks,
Shad Storhaug (NightOwl888)

-----Original Message-----
From: Roethinger, Alexander [mailto:aroethin...@affili.net]
Sent: Friday, August 11, 2017 5:32 AM
To: dev@lucenenet.apache.org
Subject: Possible Issue with DoubleField

[...]
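For readers who want to see the shape of the generic overload without opening the linked file, here is a minimal sketch. It is not the code from DocumentExtensions.cs: Document, Field, LazyField and IIndexableField below are simplified stand-ins that only borrow the names from this thread, and the extension method body is an assumption about how such an overload could be written.

```csharp
using System;
using System.Collections.Generic;

// Stand-ins mirroring names from this thread; NOT the Lucene.NET types.
public interface IIndexableField
{
    string Name { get; }
}

public class Field : IIndexableField
{
    public Field(string name) { Name = name; }
    public string Name { get; private set; }
}

// Stand-in for a read-only field implementation that is not a Field.
public class LazyField : IIndexableField
{
    public LazyField(string name) { Name = name; }
    public string Name { get; private set; }
}

public class Document
{
    private readonly List<IIndexableField> fields = new List<IIndexableField>();

    public void Add(IIndexableField field) { fields.Add(field); }

    // Returns the first field with the given name, or null if there is none.
    public IIndexableField GetField(string name)
    {
        return fields.Find(f => f.Name == name);
    }
}

public static class DocumentExtensions
{
    // Sketch of a generic overload: the caller states the expected type,
    // so the cast is explicit at the call site.
    public static T GetField<T>(this Document document, string name)
        where T : IIndexableField
    {
        return (T)document.GetField(name);
    }
}

public static class Program
{
    public static void Main()
    {
        Document doc = new Document();
        doc.Add(new Field("price"));
        doc.Add(new LazyField("body"));

        Field price = doc.GetField<Field>("price");
        Console.WriteLine("Got Field: {0}", price.Name);

        try
        {
            doc.GetField<Field>("body");
        }
        catch (InvalidCastException)
        {
            // The caller asked for Field, so the failure is unsurprising.
            Console.WriteLine("'body' is not a Field");
        }

        // Asking for the alternative type works without a separate method.
        LazyField body = doc.GetField<LazyField>("body");
        Console.WriteLine("Got LazyField: {0}", body.Name);
    }
}
```

Compared with a fixed GetFieldAsField() method, the type argument makes it clear at the call site which type is expected and that a different one can be requested.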
Possible Issue with DoubleField
Dear Devs,

I just updated my search application from v4.8.0.770-beta to v4.8.0-beta4. It required some plausible code adjustments but otherwise worked just fine, with all of my unit tests now passing.

I did notice something though with DoubleField (and possibly with SingleField as well; Int32Field and Int64Field both work fine) after extending my unit tests for some edge cases using the MaxValue of each type. Maybe I'm just missing something, but I thought I would raise it with the community and would appreciate your thoughts:

a) When storing a DoubleField with a value of double.MaxValue, the string representation of the field value is incorrect. Could it be that the round-trip format specifier "R" is missing?

b) When retrieving the same field, the FieldType properties of the retrieved field are not the same as when the field was stored.

This results in two challenges:

1) I can't use the Document.Get() method to retrieve the precise value. Instead I have to use GetNumericValue().

2) When examining the FieldType of the retrieved field, the properties do not match those of the stored field, i.e. NumericType is set to NONE even though it should be DOUBLE, and the value of IsTokenized is changed.

I'm not sure whether this is expected behavior. I would have assumed that FieldType returns the values the field was originally created with. The problem with these two points is that I can't easily deduce how to properly retrieve the value based on NumericType just from reading the field.

Another point that confuses me: Why do I need to explicitly cast FieldType to access the NumericType property instead of just accessing the FieldType property of the Field? (See the line var fieldType = (FieldType)field.FieldType; in the code.)

The sample ConsoleApplication code below illustrates the behavior.

Any feedback is welcome! And thanks for all the great work you have been doing!
Kind regards
Alexander

CODE:

using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Lucene.Net.Util;
using System;

namespace LuceneTest
{
    class Program
    {
        static void Main(string[] args)
        {
            Directory dir = new RAMDirectory();
            Analyzer analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);
            IndexWriterConfig iwc = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer);

            double value = double.MaxValue;
            string fieldName = "DoubleField";

            // Field type used for indexing: stored, indexed, not tokenized, numeric DOUBLE.
            FieldType type = new FieldType();
            type.IsIndexed = true;
            type.IsStored = true;
            type.IsTokenized = false;
            type.NumericType = NumericType.DOUBLE;

            // Index a single document and print the field as it looks at indexing time.
            using (IndexWriter writer = new IndexWriter(dir, iwc))
            {
                Document doc = new Document();
                var field = new DoubleField(fieldName, value, type);
                var fieldType = (FieldType)field.FieldType;

                Console.WriteLine("DoubleField values for indexed value");
                Console.WriteLine("StringValue: {0}", field.GetStringValue());
                Console.WriteLine("NumericValue: {0:R}", field.GetNumericValue());
                Console.WriteLine("IsIndexed: {0}", fieldType.IsIndexed);
                Console.WriteLine("IsStored: {0}", fieldType.IsStored);
                Console.WriteLine("IsTokenized: {0}", fieldType.IsTokenized);
                Console.WriteLine("Numeric: {0}", fieldType.NumericType);

                doc.Add(field);
                writer.AddDocument(doc);
                writer.Commit();
            }

            Console.WriteLine();

            // Read the document back and print the same properties for comparison.
            using (IndexReader reader = DirectoryReader.Open(dir))
            {
                IndexSearcher searcher = new IndexSearcher(reader);
                var hits = searcher.Search(new MatchAllDocsQuery(), 10).ScoreDocs;
                Document doc = searcher.Doc(hits[0].Doc);

                var field = doc.GetField(fieldName);
                var fieldType = (FieldType)field.FieldType;

                Console.WriteLine("DoubleField values for searched value");
                Console.WriteLine("StringValue: {0}", field.GetStringValue());
                Console.WriteLine("NumericValue: {0:R}", field.GetNumericValue());
                Console.WriteLine("IsIndexed: {0}", fieldType.IsIndexed);
                Console.WriteLine("IsStored: {0}", fieldType.IsStored);
                Console.WriteLine("IsTokenized: {0}", fieldType.IsTokenized);
                Console.WriteLine("Numeric: {0}", fieldType.NumericType);
            }

            Console.ReadKey();
            dir.Dispose();
        }
    }
}
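The round-trip question from point a) can be checked in isolation, without Lucene.NET. The sketch below only uses System; RoundTripDemo and ToRoundTripString are hypothetical names for illustration and say nothing about how DoubleField builds its string value internally. On .NET Framework the default format keeps 15 significant digits, so the two printed strings are expected to differ there. Newer runtimes may print the same text for both.

```csharp
using System;
using System.Globalization;

static class RoundTripDemo
{
    // Formats a double with the round-trip specifier "R" so that
    // double.Parse can recover the same value from the text.
    public static string ToRoundTripString(double value)
    {
        return value.ToString("R", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        double value = double.MaxValue;

        // Default formatting, as used when no format specifier is given.
        string defaultText = value.ToString(CultureInfo.InvariantCulture);
        string roundTripText = ToRoundTripString(value);

        Console.WriteLine("Default:    {0}", defaultText);
        Console.WriteLine("Round-trip: {0}", roundTripText);

        // Parsing the round-trip text yields the original value again.
        double parsed = double.Parse(roundTripText, CultureInfo.InvariantCulture);
        Console.WriteLine("Recovered exactly: {0}", parsed == value);
    }
}
```

If the default string has fewer digits than the round-trip string on your runtime, that would be consistent with the StringValue and NumericValue mismatch described in point a).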