What about, rather than conflating field types to create multiple
fields, using update processors to do this expansion instead?
Erik
On Nov 26, 2009, at 10:04 AM, Grant Ingersoll wrote:
On Nov 25, 2009, at 8:24 PM, Chris Hostetter wrote:
I'm having a hard time wrapping my head around this entire
concept ... I know part of my problem is that your example use case
seems somewhat nonsensical...
: As a simple proof of concept, imagine that I define a new FieldType
: called PlusMinusIntFieldType that extends IntField. This FieldType
: takes in an int value and outputs two Fields: one with the original
: value and one with the negative of the value.
...
: OK, on the search side is where it gets tricky. The whole point of this
: exercise is that the details are hidden from the user in the generic
: case. Thus, a query of plusMinus:5 should automatically expand to
: (plusMinus__0:5 OR plusMinus__1:-5). Of course, an expert user should
...nothing could match plusMinus__0:5 that didn't also match
plusMinus__1:-5, so I don't really understand what the point of using
the field expansion for a use case like this would be ... and that's
making it hard for me to try and understand how this sort of system
could/should/would be used at query time.
Kind of, if a user just inputs plusMinus:5, then sure, but they may
also want to just search the negative portion. More importantly,
though, they may have a QParser or some other component that can
appropriately select one of the fields w/o the user knowing.
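
To make the plusMinus case concrete, here is a minimal sketch of the index-time and query-time expansion. It is plain Python rather than Solr code; the function names are hypothetical and only the `__0`/`__1` suffixes come from the example above.

```python
# Hypothetical sketch of the PlusMinus expansion; not a Solr API.
# Only the "__0" / "__1" sub-field suffixes are taken from the thread.

def expand_for_index(field, value):
    """Return the (sub-field name, value) pairs that would be indexed."""
    return [(f"{field}__0", value), (f"{field}__1", -value)]


def expand_query(field, value):
    """Rewrite field:value into an OR over both sub-fields."""
    (pos_name, pos_val), (neg_name, neg_val) = expand_for_index(field, value)
    return f"({pos_name}:{pos_val} OR {neg_name}:{neg_val})"


print(expand_query("plusMinus", 5))  # (plusMinus__0:5 OR plusMinus__1:-5)
```

A QParser-like component could just as well call `expand_for_index` and pick only one of the two sub-fields, which is the "select one of the fields w/o the user knowing" case.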
perhaps a more realistic example would be helpful?
...or even some different simple and contrived examples that
demonstrate how this could be useful in a way that isn't possible
with a single field.
OK, a more concrete example is spatial. A user will want to index a
point as a lat lon. So, they index:
<field name="latLon">49, -79</field>.
The implementation of how this gets indexed can be done in several
ways. For starters, it can be represented as a single field using
Geohash or even just as a string (even if that isn't useful for
much). We don't need SOLR-1131 for that at all. Next, they may just
want to represent it as two fields: one for the lat and one for
the lon. Again, not super hard to do now, but it requires the user
to set it up, whereas with a LatLonFieldType, this would be hidden
from them. Finally, consider the cartesian tier case. In this
case, a single lat lon point could be mapped to a whole slew of
tiers, where each tier is like a zoom level on a map application
(like Google Maps). Here, we could have a CartesianTierFieldType
that takes in the lower and upper bounds of the tiers to represent,
e.g. tier 4 through 17, and this would output one field per tier
(14 fields, counting both bounds). Local Solr currently handles this
through dynamic fields and user-level knowledge of the magic fields used.
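
As a rough illustration of the one-point-to-many-fields idea, the sketch below maps a lat/lon pair to one cell id per tier. It is plain Python, not Solr code. The grid scheme (tier t splits the world into 2**t by 2**t equal-angle cells), the `__tier_N` field naming, and the inclusive bounds are all assumptions made for the example; Local Solr's real projection and field names differ.

```python
# Hypothetical sketch of a CartesianTierFieldType-style expansion.
# Assumption: tier t divides the world into 2**t x 2**t equal-angle cells.
# This is a simplification, not Local Solr's actual tier projection.

def tier_cell(lat, lon, tier):
    """Return a single integer cell id for the point at the given tier."""
    n = 2 ** tier
    # Clamp so that lat=90 or lon=180 falls into the last cell.
    x = min(int((lon + 180.0) / 360.0 * n), n - 1)
    y = min(int((lat + 90.0) / 180.0 * n), n - 1)
    return y * n + x


def expand_to_tier_fields(field, lat, lon, lower, upper):
    """Return {sub-field name: cell id} for every tier in [lower, upper]."""
    return {
        f"{field}__tier_{t}": tier_cell(lat, lon, t)
        for t in range(lower, upper + 1)
    }


fields = expand_to_tier_fields("latLon", 49.0, -79.0, 4, 17)
print(len(fields))               # 14 (both bounds inclusive)
print(fields["latLon__tier_4"])  # 196
```

The point is only that the user supplies `latLon` once and the field type owns the fan-out, instead of the user configuring a dynamic field per tier.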
For this case, there are several different search patterns:
1. The user may know the tier they want to search at and thus input
tier and a zoom level.
2. User invokes a QParser to build a bounding box (see https://issues.apache.org/jira/browse/SOLR-1568)
and the Parser is responsible for creating a filter that chooses
the most appropriate tier to search against. So, the user might
just say: {!tier lat=X lon=Y dist=10} and it will pick the most
appropriate tier, whereas putting in dist=50 would likely pick a
different tier.
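
The second pattern, where the parser chooses the tier from the distance, could look something like the sketch below. Again this is plain Python with hypothetical names. The selection rule (finest tier whose cells are at least twice the search distance wide), the equal-width grid, and the use of miles are assumptions for illustration, not how SOLR-1568 or Local Solr actually decide.

```python
# Hypothetical sketch of distance-based tier selection; not a Solr API.
# Assumptions: tier t has 2**t cells around the equator, dist is in miles,
# and a "good" tier is the finest one whose cells are >= 2 * dist wide.

EARTH_CIRCUMFERENCE_MILES = 24901.0  # approximate equatorial value


def cell_width_miles(tier):
    """Approximate east-west width of one cell at the equator."""
    return EARTH_CIRCUMFERENCE_MILES / (2 ** tier)


def pick_tier(dist, lower=4, upper=17):
    """Return the finest tier in [lower, upper] whose cells cover 2 * dist."""
    best = lower  # fall back to the coarsest tier for very large distances
    for tier in range(lower, upper + 1):
        if cell_width_miles(tier) >= 2 * dist:
            best = tier
    return best


print(pick_tier(10), pick_tier(50))  # 10 7
```

Under these assumptions dist=10 and dist=50 land on different tiers, which matches the behaviour described for {!tier lat=X lon=Y dist=10} versus dist=50.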
Does that help?
BTW, all of this is tracked via SOLR-773.