Re: [Solr Wiki] Update of "FunctionQuery" by GrantIngersoll

Grant Ingersoll Mon, 16 Nov 2009 05:21:44 -0800

On Nov 15, 2009, at 8:16 PM, Yonik Seeley wrote:

> Let's all try to summarize changes to the wiki as we would changes to
> the code - without that it's tough to tell what the changes are.
> 
> In this particular case, I'm not sure if all of the formatting changes
> were deliberate or accidental.  If accidental, I wonder if the cause
> was a bug in GUI mode, or a bug in your browser?
>


Hmm, that is bad.  I used GUI mode and I definitely did not make any of those 
changes.  

For the record, I added the dist, hsin, deg and rad functions.
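
To make those four additions concrete, here is a rough Python sketch of the arithmetic they are meant to perform. It is not the Solr code. The argument order of hsin, the convention of splitting the arguments of dist into two equal halves, and the power-0 behavior (counting the coordinates that differ) are assumptions made for illustration only.

```python
import math

def rad(degrees):
    """Convert degrees to radians."""
    return degrees * math.pi / 180.0

def deg(radians):
    """Convert radians to degrees."""
    return radians * 180.0 / math.pi

def dist(power, *values):
    """Lp distance between two points.

    Assumption: the first half of `values` is point 1 and the second half
    is point 2, so dist(2, x1, y1, x2, y2) is the Euclidean distance.
    """
    if len(values) < 2 or len(values) % 2 != 0:
        raise ValueError("need an even number of coordinates")
    half = len(values) // 2
    diffs = [abs(a - b) for a, b in zip(values[:half], values[half:])]
    if power == 0:
        # Sparseness: assumed here to be the count of differing coordinates.
        return float(sum(1 for d in diffs if d != 0))
    if math.isinf(power):
        # Infinite norm: the largest coordinate difference.
        return float(max(diffs))
    if power == 1:
        # Manhattan (taxicab) distance.
        return float(sum(diffs))
    return sum(d ** power for d in diffs) ** (1.0 / power)

def hsin(radius, lat1, lon1, lat2, lon2):
    """Haversine (great-circle) distance; all angles are in radians.

    Assumption: this parameter order is hypothetical.
    """
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error before taking asin.
    return 2 * radius * math.asin(math.sqrt(min(1.0, h)))
```

For example, dist(2, 0, 0, 3, 4) is the Euclidean distance between (0,0) and (3,4), and dist(1, 0, 0, 3, 4) is the Manhattan distance between the same points.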


> -Yonik
> http://www.lucidimagination.com
> 
> 
> 
> On Sat, Nov 14, 2009 at 9:05 AM, Apache Wiki <wikidi...@apache.org> wrote:
>> Dear Wiki user,
>> 
>> You have subscribed to a wiki page or wiki category on "Solr Wiki" for 
>> change notification.
>> 
>> The "FunctionQuery" page has been changed by GrantIngersoll.
>> http://wiki.apache.org/solr/FunctionQuery?action=diff&rev1=29&rev2=30
>> 
>> --------------------------------------------------
>> 
>> - FunctionQuery allows one to use the actual value of a numeric field and 
>> functions of those fields in a relevancy score.
>> + FunctionQuery allows one to use the actual value of a numeric field and 
>> functions of those fields in a relevancy score.
>> 
>>  <<TableOfContents>>
>> 
>>  = Using FunctionQuery =
>>  There are a few ways to use FunctionQuery from Solr's HTTP interface:
>> +
>>   1. Embed a FunctionQuery in a regular query expressed in SolrQuerySyntax 
>> via the _val_ hook
>>   1. Use the FunctionQParserPlugin, ie: {{{q={!func}log(foo)}}}
>>   1. Use a parameter that has an explicit type of FunctionQuery, such as 
>> DisMaxRequestHandler's '''bf''' (boost function) parameter.
>> -     * NOTE: the '''bf''' parameter actually takes a list of function 
>> queries separated by whitespace and each with an optional boost.  Make sure 
>> to eliminate any internal whitespace in single function queries when using 
>> '''bf'''.
>> +   * NOTE: the '''bf''' parameter actually takes a list of function queries 
>> separated by whitespace and each with an optional boost.  Make sure to 
>> eliminate any internal whitespace in single function queries when using 
>> '''bf'''.
>> -     * Example: {{{q=foo&bf="ord(popularity)^0.5 
>> recip(rord(price),1,1000,1000)^0.3"}}}
>> +   * Example: {{{q=foo&bf="ord(popularity)^0.5 
>> recip(rord(price),1,1000,1000)^0.3"}}}
>> 
>>  See SolrPlugins#ValueSourceParser for information on how to hook in your 
>> own FunctionQuery.
>> 
>> @@ -20, +21 @@
>> 
>>  There is currently no infix parser - functions must be expressed as 
>> function calls (e.g. sum(a,b) instead of a+b)
>> 
>>  = Available Functions =
>> -
>>  == constant ==
>> - <!> [[Solr1.3]]
>> - Floating point constants.
>> + <!> [[Solr1.3]] Floating point constants.
>> +
>> -     Example Syntax: '''1.5'''
>> +  . Example Syntax: '''1.5'''
>> -
>> -     SolrQuerySyntax Example: '''_val_:1.5'''
>> +  SolrQuerySyntax Example: '''_val_:1.5'''
>> 
>>  == fieldvalue ==
>>  This function returns the numeric field value of an indexed field with a 
>> maximum of one value per document (not multiValued).  The syntax is simply 
>> the field name by itself.  0 is returned for documents without a value in 
>> the field.
>> +
>> -     Example Syntax: '''myFloatField'''
>> +  . Example Syntax: '''myFloatField'''
>> -
>> -     SolrQuerySyntax Example: '''_val_:myFloatField'''
>> +  SolrQuerySyntax Example: '''_val_:myFloatField'''
>> 
>>  == ord ==
>>  ord(myfield) returns the ordinal of the indexed field value within the 
>> indexed list of terms for that field in lucene index order 
>> (lexicographically ordered by unicode value), starting at 1. In other words, 
>> for a given field, all values are ordered lexicographically; this function 
>> then returns the offset of a particular value in that ordering. The field 
>> must have a maximum of one value per document (not multiValued).  0 is 
>> returned for documents without a value in the field.
>> +
>> -    Example: If there were only three values for a particular field: 
>> "apple","banana","pear", then ord("apple")=1, ord("banana")=2, ord("pear")=3
>> +  . Example: If there were only three values for a particular field: 
>> "apple","banana","pear", then ord("apple")=1, ord("banana")=2, ord("pear")=3
>> -
>> -    Example Syntax: '''ord(myIndexedField)'''
>> +  Example Syntax: '''ord(myIndexedField)'''
>> -
>> -    Example SolrQuerySyntax: '''_val_:"ord(myIndexedField)"'''
>> +  Example SolrQuerySyntax: '''_val_:"ord(myIndexedField)"'''
>> 
>> + WARNING: as of Solr 1.4, ord() and rord() can cause excess memory use 
>> since they must use a FieldCache entry at the top level reader, while 
>> sorting and function queries now use entries at the segment level.  Hence 
>> sorting or using a different function query, in addition to ord()/rord() 
>> will double memory use.
>> - WARNING: as of Solr 1.4, ord() and rord() can cause excess memory use 
>> since they must use a FieldCache entry
>> - at the top level reader, while sorting and function queries now use 
>> entries at the segment level.  Hence sorting
>> - or using a different function query, in addition to ord()/rord() will 
>> double memory use.
>> -
>> 
>>  WARNING: ord() depends on the position in an index and can thus change when 
>> other documents are inserted or deleted, or if a !MultiSearcher is used.
>> 
>>  == rord ==
>>  The reverse ordering of what ord provides.
>> +
>> -     Example Syntax: '''rord(myIndexedField)'''
>> +  . Example Syntax: '''rord(myIndexedField)'''
>> -
>> -     Example: '''rord(myDateField)''' is a metric for how old a document 
>> is: the youngest document will return 1, the oldest document will return the 
>> total number of documents.
>> +  Example: '''rord(myDateField)''' is a metric for how old a document is: 
>> the youngest document will return 1, the oldest document will return the 
>> total number of documents.
>> 
>> + WARNING: as of Solr 1.4, ord() and rord() can cause excess memory use 
>> since they must use a FieldCache entry at the top level reader, while 
>> sorting and function queries now use entries at the segment level.  Hence 
>> sorting or using a different function query, in addition to ord()/rord() 
>> will double memory use.
>> -
>> - WARNING: as of Solr 1.4, ord() and rord() can cause excess memory use 
>> since they must use a FieldCache entry
>> - at the top level reader, while sorting and function queries now use 
>> entries at the segment level.  Hence sorting
>> - or using a different function query, in addition to ord()/rord() will 
>> double memory use.
>> 
>>  == sum ==
>> - <!> [[Solr1.3]]
>> - sum(x,y,...) returns the sum of multiple functions.
>> + <!> [[Solr1.3]] sum(x,y,...) returns the sum of multiple functions.
>> +
>> -     Example Syntax: '''sum(x,1)'''
>> +  . Example Syntax: '''sum(x,1)'''
>> -
>> -     Example Syntax: '''sum(x,y)'''
>> +  Example Syntax: '''sum(x,y)'''
>> -
>> -     Example Syntax: '''sum(sqrt(x),log(y),z,0.5)'''
>> +  Example Syntax: '''sum(sqrt(x),log(y),z,0.5)'''
>> 
>>  == sub ==
>> - <!> [[Solr1.4]]
>> - sub(x,y) returns x-y
>> + <!> [[Solr1.4]] sub(x,y) returns x-y
>> +
>> -     Example: '''sub(myfield,myfield2)'''
>> +  . Example: '''sub(myfield,myfield2)'''
>> -
>> -     Example: '''sub(100,sqrt(myfield))'''
>> +  Example: '''sub(100,sqrt(myfield))'''
>> 
>>  == product ==
>> - <!> [[Solr1.3]]
>> - product(x,y,...) returns the product of multiple functions.
>> + <!> [[Solr1.3]] product(x,y,...) returns the product of multiple functions.
>> +
>> -     Example Syntax: '''product(x,2)'''
>> +  . Example Syntax: '''product(x,2)'''
>> -
>> -     Example Syntax: '''product(x,y)'''
>> +  Example Syntax: '''product(x,y)'''
>> 
>>  == div ==
>> - <!> [[Solr1.3]]
>> - div(x,y) divides the function x by the function y.
>> + <!> [[Solr1.3]] div(x,y) divides the function x by the function y.
>> +
>> -     Example Syntax: '''div(1,x)'''
>> +  . Example Syntax: '''div(1,x)'''
>> -
>> -     Example Syntax: '''div(sum(x,100),max(y,1))'''
>> +  Example Syntax: '''div(sum(x,100),max(y,1))'''
>> 
>>  == pow ==
>> - <!> [[Solr1.3]]
>> - pow(x,y) raises the base x to the power y.
>> + <!> [[Solr1.3]] pow(x,y) raises the base x to the power y.
>> +
>> -     Example Syntax: '''pow(x,0.5)'''   same as sqrt
>> +  . Example Syntax: '''pow(x,0.5)'''   same as sqrt
>> -
>> -     Example Syntax: '''pow(x,log(y))'''
>> +  Example Syntax: '''pow(x,log(y))'''
>> 
>>  == abs ==
>> - <!> [[Solr1.3]]
>> - abs(x) returns the absolute value of a function.
>> + <!> [[Solr1.3]] abs(x) returns the absolute value of a function.
>> +
>> -     Example Syntax: '''abs(-5)'''
>> +  . Example Syntax: '''abs(-5)'''
>> -
>> -     Example Syntax: '''abs(x)'''
>> +  Example Syntax: '''abs(x)'''
>> 
>>  == log ==
>> - <!> [[Solr1.3]]
>> - log(x) returns log base 10 of the function x.
>> + <!> [[Solr1.3]] log(x) returns log base 10 of the function x.
>> +
>> -     Example Syntax: '''log(x)'''
>> +  . Example Syntax: '''log(x)'''
>> -
>> -     Example Syntax: '''log(sum(x,100))'''
>> +  Example Syntax: '''log(sum(x,100))'''
>> 
>>  == sqrt ==
>> - <!> [[Solr1.3]]
>> - sqrt(x) returns the square root of the function x
>> + <!> [[Solr1.3]] sqrt(x) returns the square root of the function x
>> +
>> -     Example Syntax: '''sqrt(2)'''
>> +  . Example Syntax: '''sqrt(2)'''
>> -
>> -     Example Syntax: '''sqrt(sum(x,100))'''
>> +  Example Syntax: '''sqrt(sum(x,100))'''
>> 
>>  == map ==
>> - <!> [[Solr1.3]]
>> - map(x,min,max,target) maps any values of the function x that fall within 
>> min and max inclusive to target.  min,max,target are constants. It outputs 
>> the field's value if it does not fall between min and max.
>> + <!> [[Solr1.3]] map(x,min,max,target) maps any values of the function x 
>> that fall within min and max inclusive to target.  min,max,target are 
>> constants. It outputs the field's value if it does not fall between min and 
>> max.
>> +
>> -     Example Syntax 1: '''map(x,0,0,1)'''  change any values of 0 to 1... 
>> useful in handling default 0 values
>> +  . Example Syntax 1: '''map(x,0,0,1)'''  change any values of 0 to 1... 
>> useful in handling default 0 values
>> -
>> -     Example Syntax 2 <!> [[Solr1.4]]: '''map(x,0,0,1,0)'''  change any 
>> values of 0 to 1, and if the value is not zero it can be set to the value 
>> of the 5th argument instead of defaulting to the field's value
>> +  Example Syntax 2 <!> [[Solr1.4]]: '''map(x,0,0,1,0)'''  change any values 
>> of 0 to 1, and if the value is not zero it can be set to the value of the 
>> 5th argument instead of defaulting to the field's value
>> -
>> -
>> -
>> 
>>  == scale ==
>> - <!> [[Solr1.3]]
>> - scale(x,minTarget,maxTarget) scales values of the function x such that 
>> they fall between minTarget and maxTarget inclusive.
>> + <!> [[Solr1.3]] scale(x,minTarget,maxTarget) scales values of the function 
>> x such that they fall between minTarget and maxTarget inclusive.
>> -     Example Syntax: '''scale(x,1,2)'''  all values will be between 1 and 2 
>> inclusive.
>> 
>> -     NOTE: The current implementation currently traverses all of the 
>> function values to obtain the min and max so it can pick the correct scale.
>> +  . Example Syntax: '''scale(x,1,2)'''  all values will be between 1 and 2 
>> inclusive. NOTE: The current implementation currently traverses all of the 
>> function values to obtain the min and max so it can pick the correct scale.
>> -
>> -     NOTE: This implementation currently cannot distinguish when documents 
>> have been deleted or documents that have no value, and 0.0 values will be 
>> used for these cases.  This means that if values are normally all greater 
>> than 0.0, one can still end up with 0.0 as the min value to map from.  In 
>> these cases, an appropriate map() function could be used as a workaround to 
>> change 0.0 to a value in the real range.  example: 
>> '''scale(map(x,0,0,5),1,2)'''
>> +  NOTE: This implementation currently cannot distinguish when documents 
>> have been deleted or documents that have no value, and 0.0 values will be 
>> used for these cases.  This means that if values are normally all greater 
>> than 0.0, one can still end up with 0.0 as the min value to map from.  In 
>> these cases, an appropriate map() function could be used as a workaround to 
>> change 0.0 to a value in the real range.  example: 
>> '''scale(map(x,0,0,5),1,2)'''
>> 
>>  == query ==
>> - <!> [[Solr1.4]]
>> - query(subquery, default) returns the score for the given subquery, or the 
>> default value for documents not matching the query.  Any type of subquery is 
>> supported through either parameter dereferencing {{{$otherparam}}} or direct 
>> specification of the query string in the LocalParams via "v".
>> + <!> [[Solr1.4]] query(subquery, default) returns the score for the given 
>> subquery, or the default value for documents not matching the query.  Any 
>> type of subquery is supported through either parameter dereferencing 
>> {{{$otherparam}}} or direct specification of the query string in the 
>> LocalParams via "v".
>> 
>> -     Example Syntax: '''q=product(popularity, query({!dismax v='solr 
>> rocks'}))''' returns the product of the popularity and the score of the 
>> dismax query.
>> +  . Example Syntax: '''q=product(popularity, query({!dismax v='solr 
>> rocks'}))''' returns the product of the popularity and the score of the 
>> dismax query.
>> -
>> -     Example Syntax: '''q=product(popularity, query($qq))&qq={!dismax}solr 
>> rocks''' is equivalent to the previous query, using param dereferencing.
>> +  Example Syntax: '''q=product(popularity, query($qq))&qq={!dismax}solr 
>> rocks''' is equivalent to the previous query, using param dereferencing.
>> -
>> -     Example Syntax: '''q=product(popularity, 
>> query($qq,0.1))&qq={!dismax}solr rocks''' specifies a default score of 0.1 
>> for documents that don't match the dismax query.
>> +  Example Syntax: '''q=product(popularity, query($qq,0.1))&qq={!dismax}solr 
>> rocks''' specifies a default score of 0.1 for documents that don't match the 
>> dismax query.
>> 
>>  == linear ==
>>  linear(x,m,c) implements m*x+c where m and c are constants and x is an 
>> arbitrary function.  This is equivalent to '''sum(product(m,x),c)''', but 
>> slightly more efficient as it is implemented as a single function.
>> +
>> -     Example Syntax: '''linear(x,2,4)'''  returns 2*x+4
>> +  . Example Syntax: '''linear(x,2,4)'''  returns 2*x+4
>> 
>>  == recip ==
>>  A reciprocal function with '''recip(x,m,a,b)''' implementing a/(m*x+b).  
>> m,a,b are constants, x is any numeric field or arbitrarily complex function.
>> 
>>  When a and b are equal, and x>=0, this function has a maximum value of 1 
>> that drops as x increases. Increasing the value of a and b together results 
>> in a movement of the entire function to a flatter part of the curve. These 
>> properties can make this an ideal function for boosting more recent 
>> documents when x is rord(datefield).
>> +
>> -     Example Syntax: '''recip(rord(creationDate),1,1000,1000)'''
>> +  . Example Syntax: '''recip(rord(creationDate),1,1000,1000)'''
>> 
>> - <!> [[Solr1.4]]
>> - In Solr 1.4 and later, best practice is to avoid ord() and rord() and 
>> derive the boost directly from the value of the date field.
>> + <!> [[Solr1.4]] In Solr 1.4 and later, best practice is to avoid ord() and 
>> rord() and derive the boost directly from the value of the date field. See 
>> ms() for more details.
>> - See ms() for more details.
>> 
>>  == max ==
>>  max(x,c) returns the max of another function and a constant.  Useful for 
>> "bottoming out" another function at some constant.
>> +
>> -     Example Syntax: '''max(myfield,0)'''
>> +  . Example Syntax: '''max(myfield,0)'''
>> 
>>  == ms ==
>>  <!> [[Solr1.4]]
>> @@ -175, +149 @@
>> 
>>  Arguments may be numerically indexed date fields such as !TrieDate (the 
>> default in 1.4), or date math (examples in SolrQuerySyntax) based on a 
>> constant date or '''NOW'''.
>> 
>>  '''ms()'''
>> +
>> -   Equivalent to '''ms(NOW)''', number of milliseconds since the epoch.
>> +  . Equivalent to '''ms(NOW)''', number of milliseconds since the epoch.
>> +
>>  '''ms(a)'''
>> +
>> -   Returns the number of milliseconds since the epoch that the argument 
>> represents.
>> +  . Returns the number of milliseconds since the epoch that the argument 
>> represents.
>> -
>> -   Example: '''ms(NOW/DAY)'''
>> +  Example: '''ms(NOW/DAY)'''
>> -
>> -   Example: '''ms(2000-01-01T00:00:00Z)'''
>> +  Example: '''ms(2000-01-01T00:00:00Z)'''
>> -
>> -   Example: '''ms(mydatefield)'''
>> +  Example: '''ms(mydatefield)'''
>> +
>>  '''ms(a,b)'''
>> +
>> -   Returns the number of milliseconds that {{{b}}} occurs before {{{a}}} 
>> (i.e. {{{a - b}}}).  Note that this offers higher precision than 
>> '''sub(a,b)''' because the arguments are not converted to floating point 
>> numbers before subtraction.
>> +  . Returns the number of milliseconds that {{{b}}} occurs before {{{a}}} 
>> (i.e. {{{a - b}}}).  Note that this offers higher precision than 
>> '''sub(a,b)''' because the arguments are not converted to floating point 
>> numbers before subtraction.
>> -
>> -   Example: '''ms(NOW,mydatefield)'''
>> +  Example: '''ms(NOW,mydatefield)'''
>> -
>> -   Example: '''ms(mydatefield,2000-01-01T00:00:00Z)'''
>> +  Example: '''ms(mydatefield,2000-01-01T00:00:00Z)'''
>> -
>> -   Example: '''ms(datefield1,datefield2)'''
>> +  Example: '''ms(datefield1,datefield2)'''
>> +
>> + == dist ==
>> + [[Solr1.5]]
>> +
>> + Return the Distance between two Vectors (points) in an n-dimensional 
>> space.  See http://en.wikipedia.org/wiki/Lp_space for more information.  
>> Takes in the power, plus two or more !ValueSource instances and calculates 
>> the distances between the two vectors.  Each !ValueSource must be a number.
>> +
>> + Common cases:
>> +
>> +  ||<tablestyle="width: 467px; height: 88px;">Power ||Common Name ||
>> +  ||0 ||Sparseness calculation ||
>> +  ||1||Manhattan (taxicab) Distance||
>> +  ||2||Euclidean Distance||
>> +  ||Infinite||Infinite norm - maximum value in the vector||
>> +
>> +
>> 
>>  === Date Boosting ===
>>  Boosting more recent content is a common use case.  One way is to use a 
>> {{{recip}}} function in conjunction with {{{ms}}}.
>> @@ -203, +191 @@
>> 
>>  Also see 
>> http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents
>> 
>>  == top ==
>> - <!> [[Solr1.4]]
>> - Causes its function query argument to derive its values from the 
>> top-level IndexReader containing all parts of an index.  For example, the 
>> ordinal of a value in a single segment will be different from the ordinal of 
>> that same value in the complete index.  The ord() and rord() functions 
>> implicitly use top() and hence ord(foo) is equivalent to top(ord(foo)).
>> + <!> [[Solr1.4]] Causes its function query argument to derive its values 
>> from the top-level IndexReader containing all parts of an index.  For 
>> example, the ordinal of a value in a single segment will be different from 
>> the ordinal of that same value in the complete index.  The ord() and rord() 
>> functions implicitly use top() and hence ord(foo) is equivalent to 
>> top(ord(foo)).
>> 
>>  = General Example =
>> -
>> - To give more idea about the use of the function query, suppose index 
>> stores dimensions in meters '''x''', '''y''','''z''' of some hypothetical 
>> boxes with arbitrary names stored in field '''boxname'''.
>> + To give more idea about the use of the function query, suppose index 
>> stores dimensions in meters '''x''', '''y''','''z''' of some hypothetical 
>> boxes with arbitrary names stored in field '''boxname'''. Suppose we want to 
>> search for box matching name ''findbox'' but ranked according to volumes of 
>> boxes, the query params would be:
>> - Suppose we want to search for box matching name ''findbox'' but ranked 
>> according to volumes of boxes, the query params would be:
>> +
>>  {{{
>>    q=boxname:findbox+_val_:"product(product(x,y),z)"
>>  }}}
>> -
>>  This will rank the results by volume, but in order to get the computed 
>> volume you will need to add the parameter...
>> +
>>  {{{
>>    &fl=*,score
>>  }}}
>> -
>>  ...where '''score''' will contain the resultant volume.
>> 
>>  Suppose you also have a field 'weight' containing the weight of the box; 
>> then to sort by the density of the box and return the value of the density 
>> in score, your query should be...
>> @@ -226, +211 @@
>> 
>>  {{{
>>  
>> http://localhost:8983/solr/select/?q=boxname:findbox+_val_:"div(weight,product(product(x,y),z))"&fl=boxname,x,y,z,weight,score
>>  }}}
>> -
>> 
>> 
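
As a sanity check on the formulas quoted above for linear, recip, map and scale, here is a small Python sketch that applies them to plain numbers instead of field values. The helper names (map_value, and a scale that works on a list) and the handling of a constant input in scale are assumptions made for the sketch, not Solr's implementation.

```python
def linear(x, m, c):
    """linear(x,m,c) = m*x + c"""
    return m * x + c

def recip(x, m, a, b):
    """recip(x,m,a,b) = a / (m*x + b)"""
    return a / (m * x + b)

def map_value(x, lo, hi, target, default=None):
    """map(x,min,max,target[,default]): values in [lo, hi] become target.

    Other values pass through unchanged, unless the optional fifth
    argument (default) is given, in which case they become default.
    """
    if lo <= x <= hi:
        return target
    return x if default is None else default

def scale(values, min_target, max_target):
    """Rescale all values linearly into [min_target, max_target].

    Like the behavior described on the wiki page, this needs the min and
    max of every value up front. Assumption: a constant input maps to
    min_target.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [float(min_target) for _ in values]
    span = max_target - min_target
    return [(v - lo) / (hi - lo) * span + min_target for v in values]

# With a == b, recip starts at 1 for x = 0 and falls off as x grows.
newest = recip(0, 1, 1000, 1000)
older = recip(1000, 1, 1000, 1000)

# The workaround from the scale note: move 0.0 placeholders into the
# real range with map before scaling.
raw = [0, 5, 10, 7]
scaled = scale([map_value(v, 0, 0, 5) for v in raw], 1, 2)
```

Here newest is the boost for the most recent document (x = 0) and older is the boost once x reaches 1000, which shows why equal a and b give a curve that starts at 1 and flattens out.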

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using 
Solr/Lucene:
http://www.lucidimagination.com/search
