[ 
https://issues.apache.org/jira/browse/SOLR-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16068518#comment-16068518
 ] 

Houston Putman edited comment on SOLR-10123 at 6/29/17 5:35 PM:
----------------------------------------------------------------

Okay, so I have updated the cloud and non-cloud schemas to add the randomized 
numeric fields. However, the randomized docValues cannot be used, since 
docValues are required for almost all Analytics Component functionality.

Almost all tests pass now; however, there is a difference between 
SortedSetDocValues (TrieField) and SortedNumericDocValues (PointField) that 
might make this impossible. SortedSetDocValues stores only the unique set of 
values for a multi-valued field, whereas SortedNumericDocValues can store the 
same value multiple times for a field on the same document. Analytics results 
can therefore vary between the two field types.
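
To make the storage difference concrete, here is a minimal Python sketch that 
models the two doc-values types with plain lists. It is not the Lucene API; the 
function names are hypothetical and chosen only for illustration.

```python
def sorted_set_model(values):
    """Model of SortedSetDocValues: per-document values are deduplicated."""
    return sorted(set(values))


def sorted_numeric_model(values):
    """Model of SortedNumericDocValues: per-document duplicates are kept."""
    return sorted(values)


field_values = [1, 1, 2, 2, 3]
trie_like = sorted_set_model(field_values)
point_like = sorted_numeric_model(field_values)
print(trie_like)   # [1, 2, 3]
print(point_like)  # [1, 1, 2, 2, 3]
```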

For example, suppose you facet on {{multi_valued_int_field}} and calculate 
{{sum(float_field)}} over just the following document:
{{Document = ( id="1", multi_valued_int_field=\[1,1,2,2,3\], float_field=3 )}}

If {{multi_valued_int_field}} were an {{IntPointField}}, then the results of 
the facet would be ( {{facet_value : facet_results, ...}} ):
{{1 : ( sum(float_field) = 6 ) , 2 : ( sum(float_field) = 6 ) , 3 : ( 
sum(float_field) = 3 )}}

If {{multi_valued_int_field}} were a {{TrieIntField}}, then the results of the 
facet would be ( {{facet_value : facet_results, ...}} ):
{{1 : ( sum(float_field) = 3 ) , 2 : ( sum(float_field) = 3 ) , 3 : ( 
sum(float_field) = 3 )}}
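
The two result sets above can be reproduced with a small Python simulation. 
This is only a sketch under the assumption that a document contributes to a 
facet bucket once per stored facet value; {{facet_sum}} and its {{dedupe}} 
flag are hypothetical names, not part of the Analytics Component.

```python
from collections import defaultdict


def facet_sum(docs, facet_field, sum_field, dedupe):
    """Sum sum_field per facet value, once per stored value of facet_field.

    dedupe=True models SortedSetDocValues (unique values per document);
    dedupe=False models SortedNumericDocValues (duplicates kept).
    """
    totals = defaultdict(float)
    for doc in docs:
        values = doc[facet_field]
        stored = sorted(set(values)) if dedupe else sorted(values)
        for value in stored:
            totals[value] += doc[sum_field]
    return dict(totals)


docs = [{"id": "1", "multi_valued_int_field": [1, 1, 2, 2, 3], "float_field": 3.0}]

point_like = facet_sum(docs, "multi_valued_int_field", "float_field", dedupe=False)
trie_like = facet_sum(docs, "multi_valued_int_field", "float_field", dedupe=True)
print(point_like)  # {1: 6.0, 2: 6.0, 3: 3.0}
print(trie_like)   # {1: 3.0, 2: 3.0, 3: 3.0}
```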

This isn't covered by the unit tests, but the same thing would occur when a 
multi-valued numeric field is used in an expression: the results could differ 
between the two field types.


> Analytics Component 2.0
> -----------------------
>
>                 Key: SOLR-10123
>                 URL: https://issues.apache.org/jira/browse/SOLR-10123
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Houston Putman
>              Labels: features
>         Attachments: SOLR-10123.patch, SOLR-10123.patch, SOLR-10123.patch
>
>
> A completely redesigned Analytics Component, introducing the following 
> features:
> * Support for distributed collections
> * New JSON request language, and response format that fits JSON better.
> * Faceting over mapping functions in addition to fields (Value Faceting)
> * PivotFaceting with ValueFacets
> * More advanced facet sorting
> * Support for PointField types
> * Expressions over multi-valued fields
> * New types of mapping functions
> ** Logical
> ** Conditional
> ** Comparison
> * Concurrent request execution
> * Custom user functions, defined within the request
> Fully backwards compatible with the original Analytics Component with the 
> following exceptions:
> * All fields used must have doc-values enabled
> * Expression results can no longer be used when defining Range and Query 
> facets
> * The reverse(string) mapping function is no longer a native function


