On Mon, Jan 4, 2010 at 5:07 PM, Grant Ingersoll <gsing...@apache.org> wrote:
>
> On Jan 4, 2010, at 4:19 PM, Yonik Seeley wrote:
>
>> On Mon, Jan 4, 2010 at 2:29 PM, <gsing...@apache.org> wrote:
>>> + public static final double KM_TO_MILES = 0.621371192;
>>> + public static final double MILES_TO_KM = 1.609344;
>>
>> I don't care if these exist, but what are your plans for actually using them?
>
> Probably premature to commit on my part, I was working on SOLR-1568 and was
> allowing the user to pass in the units for the distance value.
I still think it's no simpler for a client, and more complex overall.
You either must require units to be passed in (yuck) or decide on
default units. Once you have decided on default units, extra
parameters for different units are just increased complexity, for a
conversion that is just as trivial for the client to implement. They
either have to know the code for what units they are using or they
have to know how to convert to the standard units - about the same
amount of complexity.

>> For spatial search, it seems like we should simply standardize on
>> something, probably either meters or kilometers and be done with it.
>> It's trivial for clients to convert (and clients aren't end-users),
>> and will reduce confusion about how to specify units, etc.
>>
>> Likewise for points/locations - they should simply be lat,lon in
>> degrees. No need to specify if it's in radians or degrees when
>> degrees is more of an external standard and it's as simple for a
>> client to convert as it is to specify.
>
> Possibly, except you can save a few operations per document if you just store
> radians when using haversine.

A single multiply (~3 cycles?). If that's worth saving, we should just
index it that way for the user... but given the computational cost of
haversine, it's really in the noise... we should figure out other ways
to speed things up.

A location in the xml, when using our built-in field types, should be
unambiguously degrees in lat,lon format. How it's indexed to increase
speed, save space, etc, is up to the field type and its configuration.

> I'm just not sure I see this as a big deal. Technically, we could hide all
> the complexity of numerics from the user too, but yet we offer ints, floats
> and doubles (we could parse them on our side and figure out which is what).

But we do hide the complexity of numerics from the user (clients) as
much as we can.
popularity:10
popularity:[5 TO 10]
all work without the client knowing what kind of numeric field is being
used (with the exception of plain numerics which are offered only for
compatibility with existing lucene indexes).

> I'm more of the mindset that I think the app designer should be able to make
> the choice, but possibly with some guidance from us as to what is appropriate
> for each situation, just as we do with other field types.

Yes, absolutely. The app *designer* can make the choices and use the
appropriate field types and config, and we should isolate clients from
those choices (and changes in those choices) to the degree that it's
practical. That's what we currently do.

-Yonik
http://www.lucidimagination.com
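
To illustrate the claim above that conversion is trivial on the client side, here is a rough sketch of what a client might do before sending a distance or a location. It assumes the server standardizes on kilometers and on lat,lon in degrees, which the thread proposes but does not settle. The class and method names are hypothetical and not part of Solr; only the two constants come from the quoted commit.

```java
// Hypothetical client-side helper, not part of Solr.
// Assumes the server expects kilometers and lat,lon in degrees.
final class ClientUnits {
    // Constants as they appear in the quoted commit.
    public static final double KM_TO_MILES = 0.621371192;
    public static final double MILES_TO_KM = 1.609344;

    private ClientUnits() {}

    // A client working in miles converts once before building the request.
    static double milesToKm(double miles) {
        return miles * MILES_TO_KM;
    }

    // Converting a returned distance back for display is equally short.
    static double kmToMiles(double km) {
        return km * KM_TO_MILES;
    }

    // A client holding radians converts to degrees and emits "lat,lon".
    static String latLonDegrees(double latRadians, double lonRadians) {
        return Math.toDegrees(latRadians) + "," + Math.toDegrees(lonRadians);
    }
}
```

Either way the client needs one line: it passes a unit code, or it multiplies by a constant. That is the "same amount of complexity" argument made above.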