Scalar doesn't mean anything. Point is simple: it is a point in n-dimensional space, and that is what the data structure provides fast searching on. Numbers are points in one-dimensional space. Think of a number line.

On Mar 24, 2016 8:37 AM, "David Smiley" <[email protected]> wrote:
> bq. it wasn't at all clear that the intention was that simple scalars
> would now and forever henceforth be referred to as "points". My impression
> at the time was that the focus of the Jira was on implementation and
> storage level indexing detail rather than the user-facing API level. I see
> now that I was wrong about that. It just seems to me that there should have
> been a more direct public discussion of eliminating the concept of scalar
> values at the API level.
>
> I knew because I was following closely, but otherwise I agree with your
> sentiment. I don't love the "PointValues" terminology either nor did I
> like "DimensionalValues"; I should have suggested alternatives at the time
> but the Mike & Rob tag-team were working so fast that I didn't interject in
> the narrow window of time before a patch was put up with the current
> names. More time to publicly discuss would have been better. FWIW I like
> your suggestion for "Scalar"; that's more meaningful to me. Naming is hard.
>
> ~ David
>
> On Thu, Mar 24, 2016 at 11:28 AM Jack Krupansky <[email protected]>
> wrote:
>
>> I wasn't paying close attention when this whole PointValues saga was
>> unfolding. I get the value of points for spatial data, but conflating the
>> terms "point" and "numeric" is bizarre to say the least. Reading the code,
>> I see "Points represent numeric values", which seems nonsensical to me. A
>> little later the code comment says "Geospatial Point Types - Although basic
>> point types such as DoublePoint support points in multi-dimensional space
>> too, Lucene has specialized classes for location data...", which continues
>> this odd use of terminology. I mean, aren't all points spatial by
>> definition, so that "Geospatial Point" is redundant? It would make more
>> sense to speak of a point as a geospatial number, or that a point is
>> represented by numbers.
>>
>> IOW, NumericValues would make more sense as the base, with (spatial)
>> PointValues derived from the base of numeric values. At least to me that
>> would make more sense.
>>
>> As the PointValues was progressing I had no idea that its intent was to
>> subsume, replace, or deprecate traditional scalar numeric value support in
>> Lucene (or Solr.) It came across primarily as being an improvement for
>> spatial search.
>>
>> Not that I have any objection to greatly improved storage in Lucene, but
>> to now have to speak of all numeric data as points seems quite... weird.
>>
>> Sure, I saw the Jira traffic, like LUCENE-6825 (Add multidimensional
>> byte[] indexing support to Lucene) and LUCENE-6852 (Add DimensionalFormat
>> to Codec), but in all honesty that really did come across as relating to
>> purely spatial data and not being applicable to basic scalar number support.
>>
>> Looking at CHANGES.TXT, I see references like "LUCENE-6852, LUCENE-6975:
>> Add support for points (dimensionally indexed values)", but without any
>> hint that the intent was to subsume or replace non-dimensional numeric
>> indexed values.
>>
>> Now for all I know, non-dimensional (scalar) numeric data can very
>> efficiently be handled as if it had dimension, but that's not exactly
>> obvious and warrants at least some illumination. In traditional terminology
>> a point is 0-dimension (a line is 1-dimension, and a plane is 2-dimension),
>> but traditionally a raw number - a scalar - hasn't been referred to as
>> having dimension, so that is a new concept warranting clear definition.
>>
>> Yeah, I do recall seeing LUCENE-6917 (Deprecate and rename
>> NumericField/RangeQuery to LegacyNumeric) go by in the Jira traffic, and
>> shame on me for not reading the details more carefully, but it wasn't at
>> all clear that the intention was that simple scalars would now and forever
>> henceforth be referred to as "points". My impression at the time was that
>> the focus of the Jira was on implementation and storage level indexing
>> detail rather than the user-facing API level. I see now that I was wrong
>> about that. It just seems to me that there should have been a more direct
>> public discussion of eliminating the concept of scalar values at the API
>> level.
>>
>> (I wonder what physics would be like if they started referring to scalar
>> quantities as vectors.)
>>
>> My apologies for the rant.
>>
>>
>> -- Jack Krupansky
>>
>> On Thu, Mar 24, 2016 at 10:34 AM, David Smiley <[email protected]>
>> wrote:
>>
>>> With the move to PointValues and away from trie based indexing of the
>>> terms index, for numerics, everything associated with the trie stuff seems
>>> to be labelled as "Legacy" and marked deprecated. Even
>>> FieldType.NumericType (now FieldType.LegacyNumericType) -- a simple enum of
>>> INT, LONG, FLOAT, DOUBLE. I wonder if we ought to reconsider doing this
>>> for FieldType.NumericType, as it articulates the type of numeric data; it
>>> need not be associated with just trie indexing of terms data; it could
>>> articulate how any numeric data is encoded, be it docValues or
>>> pointValues. This is useful metadata. It's not strictly required, true,
>>> but it's useful in describing what goes in the field. This makes a
>>> FieldType instance fairly self-sufficient. Otherwise, say you have
>>> docValue numerics and/or pointValues, it's ambiguous how the data should be
>>> interpreted. This doesn't lead to a bug but would help debugging and
>>> allowing APIs to express field requirements simply by providing a FieldType
>>> instance for numeric data. It used to be self sufficient but now if we
>>> imagine the legacy stuff being removed, it's ambiguous. In addition, it
>>> would be useful metadata if it found its way into FieldInfo. Then, say
>>> Luke, could help you know what's there and maybe search it.
>>>
>>> Thoughts?
>>>
>>> ~ David
>>> --
>>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>>> http://www.solrenterprisesearchserver.com
>>>
>>
>> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
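
To make the "numbers are points in one-dimensional space" remark at the top of this thread concrete, here is a rough sketch in plain Java. It is not Lucene's PointValues API; the class PointBox and its methods range and contains are hypothetical names invented for illustration. It only shows that an ordinary numeric range test is the n = 1 case of an n-dimensional box test.

```java
/** Hypothetical illustration only; not part of Lucene. */
final class PointBox {
    private final int[] lower; // inclusive lower corner, one entry per dimension
    private final int[] upper; // inclusive upper corner, one entry per dimension

    PointBox(int[] lower, int[] upper) {
        if (lower.length != upper.length) {
            throw new IllegalArgumentException("corners must have the same number of dimensions");
        }
        this.lower = lower.clone();
        this.upper = upper.clone();
    }

    /** A plain numeric range [min, max] is just a box with one dimension. */
    static PointBox range(int min, int max) {
        return new PointBox(new int[] {min}, new int[] {max});
    }

    /** True if the point lies inside the box in every dimension. */
    boolean contains(int... point) {
        if (point.length != lower.length) {
            throw new IllegalArgumentException("point has the wrong number of dimensions");
        }
        for (int d = 0; d < point.length; d++) {
            if (point[d] < lower[d] || point[d] > upper[d]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        PointBox price = PointBox.range(10, 20);            // 1-D: a number line segment
        System.out.println(price.contains(15));             // true
        System.out.println(price.contains(25));             // false

        PointBox area = new PointBox(new int[] {0, 0}, new int[] {10, 10}); // 2-D box
        System.out.println(area.contains(3, 4));            // true
        System.out.println(area.contains(3, 11));           // false
    }
}
```

The same contains loop serves both cases; the scalar range is only a convenience constructor on top of it, which is roughly the argument being made for treating numbers as points.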
