Re: [Solr Wiki] Update of "FunctionQuery" by GrantIngersoll

Yonik Seeley Sun, 15 Nov 2009 17:17:25 -0800

Let's all try to summarize changes to the wiki as we would changes to
the code - without that it's tough to tell what the changes are.


In this particular case, I'm not sure if all of the formatting changes
were deliberate or accidental.  If accidental, I wonder if the cause
was a bug in GUI mode, or a bug in your browser?

-Yonik
http://www.lucidimagination.com



On Sat, Nov 14, 2009 at 9:05 AM, Apache Wiki <wikidi...@apache.org> wrote:
> Dear Wiki user,
>
> You have subscribed to a wiki page or wiki category on "Solr Wiki" for change 
> notification.
>
> The "FunctionQuery" page has been changed by GrantIngersoll.
> http://wiki.apache.org/solr/FunctionQuery?action=diff&rev1=29&rev2=30
>
> --------------------------------------------------
>
> - FunctionQuery allows one to use the actual value of a numeric field and 
> functions of those fields in a relevancy score.
> + FunctionQuery allows one to use the actual value of a numeric field and 
> functions of those fields in a relevancy score.
>
>  <<TableOfContents>>
>
>  = Using FunctionQuery =
>  There are a few ways to use FunctionQuery from Solr's HTTP interface:
> +
>   1. Embed a FunctionQuery in a regular query expressed in SolrQuerySyntax 
> via the _val_ hook
>   1. Use the FunctionQParserPlugin, ie: {{{q={!func}log(foo)}}}
>   1. Use a parameter that has an explicit type of FunctionQuery, such as 
> DisMaxRequestHandler's '''bf''' (boost function) parameter.
> -     * NOTE: the '''bf''' parameter actually takes a list of function 
> queries separated by whitespace and each with an optional boost.  Make sure 
> to eliminate any internal whitespace in single function queries when using 
> '''bf'''.
> +   * NOTE: the '''bf''' parameter actually takes a list of function queries 
> separated by whitespace and each with an optional boost.  Make sure to 
> eliminate any internal whitespace in single function queries when using 
> '''bf'''.
> -     * Example: {{{q=foo&bf="ord(popularity)^0.5 
> recip(rord(price),1,1000,1000)^0.3"}}}
> +   * Example: {{{q=foo&bf="ord(popularity)^0.5 
> recip(rord(price),1,1000,1000)^0.3"}}}
>
>  See SolrPlugins#ValueSourceParser for information on how to hook in your own 
> FunctionQuery.
>
> @@ -20, +21 @@
>
>  There is currently no infix parser - functions must be expressed as function 
> calls (e.g. sum(a,b) instead of a+b)
>
>  = Available Functions =
> -
>  == constant ==
> - <!> [[Solr1.3]]
> - Floating point constants.
> + <!> [[Solr1.3]] Floating point constants.
> +
> -     Example Syntax: '''1.5'''
> +  . Example Syntax: '''1.5'''
> -
> -     SolrQuerySyntax Example: '''_val_:1.5'''
> +  SolrQuerySyntax Example: '''_val_:1.5'''
>
>  == fieldvalue ==
>  This function returns the numeric field value of an indexed field with a 
> maximum of one value per document (not multiValued).  The syntax is simply 
> the field name by itself.  0 is returned for documents without a value in the 
> field.
> +
> -     Example Syntax: '''myFloatField'''
> +  . Example Syntax: '''myFloatField'''
> -
> -     SolrQuerySyntax Example: '''_val_:myFloatField'''
> +  SolrQuerySyntax Example: '''_val_:myFloatField'''
>
>  == ord ==
>  ord(myfield) returns the ordinal of the indexed field value within the 
> indexed list of terms for that field in lucene index order (lexicographically 
> ordered by unicode value), starting at 1. In other words, for a given field, 
> all values are ordered lexicographically; this function then returns the 
> offset of a particular value in that ordering. The field must have a maximum 
> of one value per document (not multiValued).  0 is returned for documents 
> without a value in the field.
> +
> -    Example: If there were only three values for a particular field: 
> "apple","banana","pear", then ord("apple")=1, ord("banana")=2, ord("pear")=3
> +  . Example: If there were only three values for a particular field: 
> "apple","banana","pear", then ord("apple")=1, ord("banana")=2, ord("pear")=3
> -
> -    Example Syntax: '''ord(myIndexedField)'''
> +  Example Syntax: '''ord(myIndexedField)'''
> -
> -    Example SolrQuerySyntax: '''_val_:"ord(myIndexedField)"'''
> +  Example SolrQuerySyntax: '''_val_:"ord(myIndexedField)"'''
>
> + WARNING: as of Solr 1.4, ord() and rord() can cause excess memory use since 
> they must use a FieldCache entry at the top level reader, while sorting and 
> function queries now use entries at the segment level.  Hence sorting or 
> using a different function query, in addition to ord()/rord() will double 
> memory use.
> - WARNING: as of Solr 1.4, ord() and rord() can cause excess memory use since 
> they must use a FieldCache entry
> - at the top level reader, while sorting and function queries now use entries 
> at the segment level.  Hence sorting
> - or using a different function query, in addition to ord()/rord() will 
> double memory use.
> -
>
>  WARNING: ord() depends on the position in an index and can thus change when 
> other documents are inserted or deleted, or if a !MultiSearcher is used.
>
>  == rord ==
>  The reverse ordering of what ord provides.
> +
> -     Example Syntax: '''rord(myIndexedField)'''
> +  . Example Syntax: '''rord(myIndexedField)'''
> -
> -     Example: '''rord(myDateField)''' is a metric for how old a document is: 
> the youngest document will return 1, the oldest document will return the 
> total number of documents.
> +  Example: '''rord(myDateField)''' is a metric for how old a document is: 
> the youngest document will return 1, the oldest document will return the 
> total number of documents.
>
> + WARNING: as of Solr 1.4, ord() and rord() can cause excess memory use since 
> they must use a FieldCache entry at the top level reader, while sorting and 
> function queries now use entries at the segment level.  Hence sorting or 
> using a different function query, in addition to ord()/rord() will double 
> memory use.
> -
> - WARNING: as of Solr 1.4, ord() and rord() can cause excess memory use since 
> they must use a FieldCache entry
> - at the top level reader, while sorting and function queries now use entries 
> at the segment level.  Hence sorting
> - or using a different function query, in addition to ord()/rord() will 
> double memory use.
>
>  == sum ==
> - <!> [[Solr1.3]]
> - sum(x,y,...) returns the sum of multiple functions.
> + <!> [[Solr1.3]] sum(x,y,...) returns the sum of multiple functions.
> +
> -     Example Syntax: '''sum(x,1)'''
> +  . Example Syntax: '''sum(x,1)'''
> -
> -     Example Syntax: '''sum(x,y)'''
> +  Example Syntax: '''sum(x,y)'''
> -
> -     Example Syntax: '''sum(sqrt(x),log(y),z,0.5)'''
> +  Example Syntax: '''sum(sqrt(x),log(y),z,0.5)'''
>
>  == sub ==
> - <!> [[Solr1.4]]
> - sub(x,y) returns x-y
> + <!> [[Solr1.4]] sub(x,y) returns x-y
> +
> -     Example: '''sub(myfield,myfield2)'''
> +  . Example: '''sub(myfield,myfield2)'''
> -
> -     Example: '''sub(100,sqrt(myfield))'''
> +  Example: '''sub(100,sqrt(myfield))'''
>
>  == product ==
> - <!> [[Solr1.3]]
> - product(x,y,...) returns the product of multiple functions.
> + <!> [[Solr1.3]] product(x,y,...) returns the product of multiple functions.
> +
> -     Example Syntax: '''product(x,2)'''
> +  . Example Syntax: '''product(x,2)'''
> -
> -     Example Syntax: '''product(x,y)'''
> +  Example Syntax: '''product(x,y)'''
>
>  == div ==
> - <!> [[Solr1.3]]
> - div(x,y) divides the function x by the function y.
> + <!> [[Solr1.3]] div(x,y) divides the function x by the function y.
> +
> -     Example Syntax: '''div(1,x)'''
> +  . Example Syntax: '''div(1,x)'''
> -
> -     Example Syntax: '''div(sum(x,100),max(y,1))'''
> +  Example Syntax: '''div(sum(x,100),max(y,1))'''
>
>  == pow ==
> - <!> [[Solr1.3]]
> - pow(x,y) raises the base x to the power y.
> + <!> [[Solr1.3]] pow(x,y) raises the base x to the power y.
> +
> -     Example Syntax: '''pow(x,0.5)'''   same as sqrt
> +  . Example Syntax: '''pow(x,0.5)'''   same as sqrt
> -
> -     Example Syntax: '''pow(x,log(y))'''
> +  Example Syntax: '''pow(x,log(y))'''
>
>  == abs ==
> - <!> [[Solr1.3]]
> - abs(x) returns the absolute value of a function.
> + <!> [[Solr1.3]] abs(x) returns the absolute value of a function.
> +
> -     Example Syntax: '''abs(-5)'''
> +  . Example Syntax: '''abs(-5)'''
> -
> -     Example Syntax: '''abs(x)'''
> +  Example Syntax: '''abs(x)'''
>
>  == log ==
> - <!> [[Solr1.3]]
> - log(x) returns log base 10 of the function x.
> + <!> [[Solr1.3]] log(x) returns log base 10 of the function x.
> +
> -     Example Syntax: '''log(x)'''
> +  . Example Syntax: '''log(x)'''
> -
> -     Example Syntax: '''log(sum(x,100))'''
> +  Example Syntax: '''log(sum(x,100))'''
>
>  == sqrt ==
> - <!> [[Solr1.3]]
> - sqrt(x) returns the square root of the function x
> + <!> [[Solr1.3]] sqrt(x) returns the square root of the function x
> +
> -     Example Syntax: '''sqrt(2)'''
> +  . Example Syntax: '''sqrt(2)'''
> -
> -     Example Syntax: '''sqrt(sum(x,100))'''
> +  Example Syntax: '''sqrt(sum(x,100))'''
>
>  == map ==
> - <!> [[Solr1.3]]
> - map(x,min,max,target) maps any values of the function x that fall within 
> min and max inclusive to target.  min,max,target are constants. It outputs 
> the field's value if it does not fall between min and max.
> + <!> [[Solr1.3]] map(x,min,max,target) maps any values of the function x 
> that fall within min and max inclusive to target.  min,max,target are 
> constants. It outputs the field's value if it does not fall between min and 
> max.
> +
> -     Example Syntax 1: '''map(x,0,0,1)'''  change any values of 0 to 1... 
> useful in handling default 0 values
> +  . Example Syntax 1: '''map(x,0,0,1)'''  change any values of 0 to 1... 
> useful in handling default 0 values
> -
> -     Example Syntax 2 <!> [[Solr1.4]]: '''map(x,0,0,1,0)'''  change any 
> values of 0 to 1, and if the value is not zero it can be set to the value of 
> the 5th argument instead of defaulting to the field's value
> +  Example Syntax 2 <!> [[Solr1.4]]: '''map(x,0,0,1,0)'''  change any values 
> of 0 to 1, and if the value is not zero it can be set to the value of the 
> 5th argument instead of defaulting to the field's value
> -
> -
> -
>
>  == scale ==
> - <!> [[Solr1.3]]
> - scale(x,minTarget,maxTarget) scales values of the function x such that they 
> fall between minTarget and maxTarget inclusive.
> + <!> [[Solr1.3]] scale(x,minTarget,maxTarget) scales values of the function 
> x such that they fall between minTarget and maxTarget inclusive.
> -     Example Syntax: '''scale(x,1,2)'''  all values will be between 1 and 2 
> inclusive.
>
> -     NOTE: The current implementation currently traverses all of the 
> function values to obtain the min and max so it can pick the correct scale.
> +  . Example Syntax: '''scale(x,1,2)'''  all values will be between 1 and 2 
> inclusive. NOTE: The current implementation currently traverses all of the 
> function values to obtain the min and max so it can pick the correct scale.
> -
> -     NOTE: This implementation currently cannot distinguish when documents 
> have been deleted or documents that have no value, and 0.0 values will be 
> used for these cases.  This means that if values are normally all greater 
> than 0.0, one can still end up with 0.0 as the min value to map from.  In 
> these cases, an appropriate map() function could be used as a workaround to 
> change 0.0 to a value in the real range.  example: 
> '''scale(map(x,0,0,5),1,2)'''
> +  NOTE: This implementation currently cannot distinguish when documents have 
> been deleted or documents that have no value, and 0.0 values will be used for 
> these cases.  This means that if values are normally all greater than 0.0, 
> one can still end up with 0.0 as the min value to map from.  In these cases, 
> an appropriate map() function could be used as a workaround to change 0.0 to 
> a value in the real range.  example: '''scale(map(x,0,0,5),1,2)'''
>
>  == query ==
> - <!> [[Solr1.4]]
> - query(subquery, default) returns the score for the given subquery, or the 
> default value for documents not matching the query.  Any type of subquery is 
> supported through either parameter dereferencing {{{$otherparam}}} or direct 
> specification of the query string in the LocalParams via "v".
> + <!> [[Solr1.4]] query(subquery, default) returns the score for the given 
> subquery, or the default value for documents not matching the query.  Any 
> type of subquery is supported through either parameter dereferencing 
> {{{$otherparam}}} or direct specification of the query string in the 
> LocalParams via "v".
>
> -     Example Syntax: '''q=product(popularity, query({!dismax v='solr 
> rocks'})''' returns the product of the popularity and the score of the dismax 
> query.
> +  . Example Syntax: '''q=product(popularity, query({!dismax v='solr 
> rocks'})''' returns the product of the popularity and the score of the dismax 
> query.
> -
> -     Example Syntax: '''q=product(popularity, query($qq))&qq={!dismax}solr 
> rocks''' is equivalent to the previous query, using param dereferencing.
> +  Example Syntax: '''q=product(popularity, query($qq))&qq={!dismax}solr 
> rocks''' is equivalent to the previous query, using param dereferencing.
> -
> -     Example Syntax: '''q=product(popularity, 
> query($qq,0.1))&qq={!dismax}solr rocks''' specifies a default score of 0.1 
> for documents that don't match the dismax query.
> +  Example Syntax: '''q=product(popularity, query($qq,0.1))&qq={!dismax}solr 
> rocks''' specifies a default score of 0.1 for documents that don't match the 
> dismax query.
>
>  == linear ==
>  linear(x,m,c) implements m*x+c where m and c are constants and x is an 
> arbitrary function.  This is equivalent to '''sum(product(m,x),c)''', but 
> slightly more efficient as it is implemented as a single function.
> +
> -     Example Syntax: '''linear(x,2,4)'''  returns 2*x+4
> +  . Example Syntax: '''linear(x,2,4)'''  returns 2*x+4
>
>  == recip ==
>  A reciprocal function with '''recip(x,m,a,b)''' implementing a/(m*x+b).  
> m,a,b are constants, x is any numeric field or arbitrarily complex function.
>
>  When a and b are equal, and x>=0, this function has a maximum value of 1 
> that drops as x increases. Increasing the value of a and b together results 
> in a movement of the entire function to a flatter part of the curve. These 
> properties can make this an ideal function for boosting more recent documents 
> when x is rord(datefield).
> +
> -     Example Syntax: '''recip(rord(creationDate),1,1000,1000)'''
> +  . Example Syntax: '''recip(rord(creationDate),1,1000,1000)'''
>
> - <!> [[Solr1.4]]
> - In Solr 1.4 and later, best practice is to avoid ord() and rord() and 
> derive the boost directly from the value of the date field.
> + <!> [[Solr1.4]] In Solr 1.4 and later, best practice is to avoid ord() and 
> rord() and derive the boost directly from the value of the date field. See 
> ms() for more details.
> - See ms() for more details.
>
>  == max ==
>  max(x,c) returns the max of another function and a constant.  Useful for 
> "bottoming out" another function at some constant.
> +
> -     Example Syntax: '''max(myfield,0)'''
> +  . Example Syntax: '''max(myfield,0)'''
>
>  == ms ==
>  <!> [[Solr1.4]]
> @@ -175, +149 @@
>
>  Arguments may be numerically indexed date fields such as !TrieDate (the 
> default in 1.4), or date math (examples in SolrQuerySyntax) based on a 
> constant date or '''NOW'''.
>
>  '''ms()'''
> +
> -   Equivalent to '''ms(NOW)''', number of milliseconds since the epoch.
> +  . Equivalent to '''ms(NOW)''', number of milliseconds since the epoch.
> +
>  '''ms(a)'''
> +
> -   Returns the number of milliseconds since the epoch that the argument 
> represents.
> +  . Returns the number of milliseconds since the epoch that the argument 
> represents.
> -
> -   Example: '''ms(NOW/DAY)'''
> +  Example: '''ms(NOW/DAY)'''
> -
> -   Example: '''ms(2000-01-01T00:00:00Z)'''
> +  Example: '''ms(2000-01-01T00:00:00Z)'''
> -
> -   Example: '''ms(mydatefield)'''
> +  Example: '''ms(mydatefield)'''
> +
>  '''ms(a,b)'''
> +
> -   Returns the number of milliseconds that {{{b}}} occurs before {{{a}}} 
> (i.e. {{{a - b}}}).  Note that this offers higher precision than 
> '''sub(a,b)''' because the arguments are not converted to floating point 
> numbers before subtraction.
> +  . Returns the number of milliseconds that {{{b}}} occurs before {{{a}}} 
> (i.e. {{{a - b}}}).  Note that this offers higher precision than 
> '''sub(a,b)''' because the arguments are not converted to floating point 
> numbers before subtraction.
> -
> -   Example: '''ms(NOW,mydatefield)'''
> +  Example: '''ms(NOW,mydatefield)'''
> -
> -   Example: '''ms(mydatefield,2000-01-01T00:00:00Z)'''
> +  Example: '''ms(mydatefield,2000-01-01T00:00:00Z)'''
> -
> -   Example: '''ms(datefield1,datefield2)'''
> +  Example: '''ms(datefield1,datefield2)'''
> +
> + == dist ==
> + [[Solr1.5]]
> +
> + Return the Distance between two Vectors (points) in an n-dimensional space. 
>  See http://en.wikipedia.org/wiki/Lp_space for more information.  Takes in 
> the power, plus two or more !ValueSource instances and calculates the 
> distances between the two vectors.  Each !ValueSource must be a number.
> +
> + Common cases:
> +
> +  ||<tablestyle="width: 467px; height: 88px;">Power ||Common Name ||
> +  ||0 ||Sparseness calculation ||
> +  ||1||Manhattan (taxicab) Distance||
> +  ||2||Euclidean Distance||
> +  ||Infinite||Infinite norm - maximum value in the vector||
> +
> +
>
>  === Date Boosting ===
>  Boosting more recent content is a common use case.  One way is to use a 
> {{{recip}}} function in conjunction with {{{ms}}}.
> @@ -203, +191 @@
>
>  Also see 
> http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents
>
>  == top ==
> - <!> [[Solr1.4]]
> - Causes its function query argument to derive its values from the 
> top-level IndexReader containing all parts of an index.  For example, the 
> ordinal of a value in a single segment will be different from the ordinal of 
> that same value in the complete index.  The ord() and rord() functions 
> implicitly use top() and hence ord(foo) is equivalent to top(ord(foo)).
> + <!> [[Solr1.4]] Causes its function query argument to derive its values 
> from the top-level IndexReader containing all parts of an index.  For 
> example, the ordinal of a value in a single segment will be different from 
> the ordinal of that same value in the complete index.  The ord() and rord() 
> functions implicitly use top() and hence ord(foo) is equivalent to 
> top(ord(foo)).
>
>  = General Example =
> -
> - To give more idea about the use of the function query, suppose index stores 
> dimensions in meters '''x''', '''y''','''z''' of some hypothetical boxes with 
> arbitrary names stored in field '''boxname'''.
> + To give more idea about the use of the function query, suppose index stores 
> dimensions in meters '''x''', '''y''','''z''' of some hypothetical boxes with 
> arbitrary names stored in field '''boxname'''. Suppose we want to search for 
> box matching name ''findbox'' but ranked according to volumes of boxes, the 
> query params would be:
> - Suppose we want to search for box matching name ''findbox'' but ranked 
> according to volumes of boxes, the query params would be:
> +
>  {{{
>    q=boxname:findbox+_val_:"product(product(x,y),z)"
>  }}}
> -
>  This will rank the results by volume, but in order to get the 
> computed volume you will need to add the parameter...
> +
>  {{{
>    &fl=*,score
>  }}}
> -
>  ...where '''score''' will contain the resultant volume.
>
>  Suppose you also have a field containing weight of the box as 'weight', then 
> to sort by the density of the box and return the value of the density in 
> score your query should be...
> @@ -226, +211 @@
>
>  {{{
>  http://localhost:8983/solr/select/?q=boxname:findbox+_val_:"div(weight,product(product(x,y),z))"&fl=boxname,x,y,z,weight,score
>  }}}
> -
>
>
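
For anyone reading the quoted page, the arithmetic behind recip, linear, map and scale can be sketched in plain Python. This is only an illustration of the formulas as the page states them, not Solr's implementation. The helper names (map_range, and scale operating on a list) are hypothetical, and the linear min-max rescaling inside scale, including what it does when all values are equal, is an assumption.

```python
def recip(x, m, a, b):
    """a / (m*x + b), the formula the page gives for recip(x,m,a,b)."""
    return a / (m * x + b)


def linear(x, m, c):
    """m*x + c, the formula the page gives for linear(x,m,c)."""
    return m * x + c


def map_range(x, lo, hi, target, default=None):
    """Values within [lo, hi] inclusive become target.

    Other values pass through unchanged (4-argument form) or become
    default (the 5-argument form the page marks as Solr 1.4).
    """
    if lo <= x <= hi:
        return target
    return x if default is None else default


def scale(values, min_target, max_target):
    """Rescale values into [min_target, max_target].

    Assumption: a linear min-max rescale over all values, which matches
    the page's note that every value is traversed to find min and max.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: with no spread, map everything to min_target.
        return [float(min_target) for _ in values]
    span = max_target - min_target
    return [min_target + (v - lo) * span / (hi - lo) for v in values]


# With a == b and x >= 0, recip peaks at 1 and falls as x grows.
newest = recip(0, 1, 1000, 1000)
older = recip(1000, 1, 1000, 1000)

# The workaround from the scale note: 0.0 stands in for a missing value,
# so map it into the real range before scaling.
raw = [0.0, 6.0, 8.0, 10.0]
patched = [map_range(v, 0, 0, 5) for v in raw]
scaled = scale(patched, 1, 2)
```

Without the map step, the 0.0 placeholder would become the minimum and compress the real values into the upper part of the target range.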

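The new dist section describes an Lp-space distance selected by a power argument. A rough Python sketch of the four cases in its table follows. The function name and list-based signature are hypothetical, and reading power 0 as "count of coordinates that differ" is an assumption about what the page calls the sparseness calculation.

```python
import math


def lp_dist(power, v1, v2):
    """Distance between two equal-length numeric vectors for a given power."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same number of dimensions")
    diffs = [abs(a - b) for a, b in zip(v1, v2)]
    if power == 0:
        # Assumption: sparseness = number of coordinates that differ.
        return float(sum(1 for d in diffs if d != 0))
    if math.isinf(power):
        # Infinite norm: the largest single coordinate difference.
        return float(max(diffs))
    # General case: (sum |a_i - b_i|^p)^(1/p); p=1 Manhattan, p=2 Euclidean.
    return sum(d ** power for d in diffs) ** (1.0 / power)


manhattan = lp_dist(1, [0, 0], [3, 4])
euclidean = lp_dist(2, [0, 0], [3, 4])
infinite = lp_dist(math.inf, [0, 0], [3, 4])
```

The power-0 and infinite cases are handled separately because the general formula is undefined for them.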