On Dec 10, 2009, at 1:44 PM, Grant Ingersoll wrote:

> 
> On Dec 10, 2009, at 1:26 PM, Yonik Seeley wrote:
> 
>> I wouldn't necessarily link FieldType.isPolyField() to the idea of a
>> poly value source... they are two different things.
> 
> Yep.  The word Poly is overloaded here to mean multiple ValueSources, but it 
> isn't necessarily tied to there being a poly field, even though a PolyField 
> likely would create a PolyValueSource.
> 
>> For example, if NumericField had not already been written in Lucene, I
>> would have perhaps just indexed both the lat and lon into the same
>> lucene field.  That part can be more of an implementation detail, and
>> does not reflect the semantics of the field (the fact that it contains
>> both a lat and lon).
> 
> Maybe, there are tradeoffs though.  
> 
> Let's get concrete and look at the VectorDistanceFunction (dist()).  It can 
> currently take in an even number of ValueSource instances, and the distance 
> method essentially boils down to (for Euclidean Distance):
> double result = 0;
> for (int i = 0; i < docValues1.length; i++) {
>   double v = docValues1[i].doubleVal(doc) - docValues2[i].doubleVal(doc);
>   result += v * v;
> }
> result = Math.sqrt(result);
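For anyone who wants to try the arithmetic outside Solr, here is a minimal sketch of the same loop. It assumes the power parameter is the exponent of a p-norm (power = 2 being the Euclidean case quoted above) and that power > 0. The class and method names are made up for illustration, and plain double arrays stand in for the per-document DocValues lookups.

```java
/** Sketch only: plain double arrays stand in for per-document DocValues lookups. */
final class VectorDistanceSketch {

    /**
     * p-norm distance between two equal-length coordinate arrays.
     * Assumption: power > 0; power = 2 is the Euclidean case.
     */
    static double distance(double power, double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Both points must have the same dimension");
        }
        if (power <= 0) {
            throw new IllegalArgumentException("This sketch only handles power > 0");
        }
        double result = 0;
        for (int i = 0; i < a.length; i++) {
            double v = Math.abs(a[i] - b[i]);
            result += Math.pow(v, power);
        }
        return Math.pow(result, 1.0 / power);
    }

    public static void main(String[] args) {
        // Distance between (0, 0) and (3, 4) under the Euclidean norm.
        System.out.println(distance(2, new double[] {0, 0}, new double[] {3, 4}));
        // Same points under power = 1 (sum of absolute differences).
        System.out.println(distance(1, new double[] {0, 0}, new double[] {3, 4}));
    }
}
```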
> 
> For example, a call to this might be:
> dist(power = 2, x1, y1, x2, y2)
> where xi, yi are ValueSources.  ("power = 2" is just me labeling the first 
> parameter so that no one wonders why there is an extra number in there.)
> 
> Now, assuming PointType fields named point1 and point2 (along with the 
> others above), one could have:
> 
> dist(2,  point1, point2)  //distance between two PointTypes
> dist(2, point1, x1, y1) //distance between a PointType and a user-defined 
> point.
> 

D'oh, I see another way of doing this: have the distance functions work only 
with points.

The second case above then becomes:
dist(2, point1, point(x1, y1));
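A rough sketch of what the points-only idea could look like, in plain Java rather than against the real Solr classes. Scalar, Point, point() and dist() here are all hypothetical stand-ins (Scalar plays the role of a single-valued ValueSource), and only the Euclidean case is shown:

```java
import java.util.Arrays;
import java.util.List;

final class PointDistanceSketch {

    /** Hypothetical stand-in for a single-valued source (a ValueSource in Solr terms). */
    interface Scalar {
        double value(int doc);
    }

    /** Hypothetical stand-in for a point: a fixed-size tuple of scalar sources. */
    static final class Point {
        final List<Scalar> coords;

        Point(List<Scalar> coords) {
            this.coords = coords;
        }

        int dimension() {
            return coords.size();
        }
    }

    /** Mirrors point(x1, y1): wraps any number of scalar sources into one point. */
    static Point point(Scalar... coords) {
        return new Point(Arrays.asList(coords));
    }

    /** Mirrors dist(2, p1, p2) for the Euclidean case; it accepts points only. */
    static double dist(Point p1, Point p2, int doc) {
        if (p1.dimension() != p2.dimension()) {
            throw new IllegalArgumentException("Points must have the same dimension");
        }
        double sum = 0;
        for (int i = 0; i < p1.dimension(); i++) {
            double v = p1.coords.get(i).value(doc) - p2.coords.get(i).value(doc);
            sum += v * v;
        }
        return Math.sqrt(sum);
    }
}
```

With this shape the distance function never has to count or pair up loose sources; the only check left is that both points have the same dimension, which also covers the n-dim. case.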


-Grant


> While I think this can be coded up in the lat/lon case (i.e. two values) I 
> think it gets hairy when you consider a point in n-dim. space.
> 
> My inclination is to fudge on this and do something in ValueSourceParser for 
> each of the functions that can deal w/ poly fields (my gut says most can't) 
> like:
> addParser("dist", new ValueSourceParser() {
>      public ValueSource parse(FunctionQParser fp) throws ParseException {
>        float power = fp.parseFloat();
>        List<ValueSource> sources = fp.parseValueSourceList();
>        if (sources.size() % 2 != 0) {
>          // expand if needed
>          List<ValueSource> newSources = new ArrayList<ValueSource>();
>          for (ValueSource source : sources) {
>            if (source instanceof PolyValueSource) {
>              List<ValueSource> tmp = ((PolyValueSource) source).getValueSources();
>              newSources.addAll(tmp);
>            } else {
>              newSources.add(source);  // just like the old one
>            }
>          }
>          sources = newSources;
>          // do the even check again here
>          if (sources.size() % 2 != 0) {
>            throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
>                "Illegal number of sources.  There must be an even number of sources");
>          }
>        }
>        int dim = sources.size() / 2;
>        List<ValueSource> sources1 = new ArrayList<ValueSource>(dim);
>        List<ValueSource> sources2 = new ArrayList<ValueSource>(dim);
>        splitSources(dim, sources, sources1, sources2);
>        return new VectorDistanceFunction(power, sources1, sources2);
>      }
>    });
> 
> Of course, this requires documentation, etc. for others to be able to do the 
> same for their custom Functions, but that is surmountable.  
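The expansion step on its own can be sketched in plain Java as below. Source and PolySource are hypothetical stand-ins for ValueSource and the proposed PolyValueSource, and the split into two halves is my assumption about what splitSources does, since it is not shown above. Note that this variant always expands before checking for an even count, which differs from the parser sketch above; that way a call like dist(2, point1, point2), which already has an even number of arguments, still gets flattened.

```java
import java.util.ArrayList;
import java.util.List;

final class PolyExpansionSketch {

    /** Hypothetical stand-in for a single-valued source. */
    interface Source {}

    /** Hypothetical stand-in for the proposed PolyValueSource. */
    interface PolySource extends Source {
        List<Source> getSources();
    }

    /** Replaces every poly source with its component sources, keeping order. */
    static List<Source> expand(List<Source> sources) {
        List<Source> flat = new ArrayList<>();
        for (Source s : sources) {
            if (s instanceof PolySource) {
                flat.addAll(((PolySource) s).getSources());
            } else {
                flat.add(s); // plain sources pass through unchanged
            }
        }
        return flat;
    }

    /** Expands, checks for an even count, then splits into first and second half. */
    static List<List<Source>> split(List<Source> sources) {
        List<Source> flat = expand(sources);
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException(
                "Illegal number of sources.  There must be an even number of sources");
        }
        int dim = flat.size() / 2;
        List<List<Source>> halves = new ArrayList<>();
        halves.add(new ArrayList<>(flat.subList(0, dim)));
        halves.add(new ArrayList<>(flat.subList(dim, flat.size())));
        return halves;
    }
}
```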
> 
> -Grant
> 

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using 
Solr/Lucene:
http://www.lucidimagination.com/search
