Hi Martijn,

Thanks for the response!

Doesn't it take a lot more memory to hold a string field in the FieldCache than 
a long field? 

In our grouping scenario, we have many unique values with a small number of 
documents per group. I would think that even the double FieldCache memory hit 
on a long would be less than using a string. 
 
Would this be a suitable place for a grouping parameter that controls the 
behavior, say group.method? I'm also looking at using the 
BlockGroupingCollector, so perhaps "block" could be another choice.
The downside is that some combinations would be invalid. (You wouldn't set 
group.method to anything else if you were using a function to group.)
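
As a rough illustration only, such a parameter could be resolved along these 
lines. Note that group.method does not exist in Solr; the class 
GroupMethodSketch, the Method enum and resolve() are hypothetical names, and 
the fallback is simply whatever the existing type-based choice would have been.

```java
import java.util.Locale;

/** Hypothetical handling of a "group.method" request parameter. Not existing Solr code. */
class GroupMethodSketch {
    enum Method { TERM, FUNCTION, BLOCK }

    /**
     * @param requested       value of the hypothetical group.method parameter, or null if absent
     * @param groupByFunction true when the request groups by a function rather than a plain field
     * @param existingDefault the implementation the current type-based logic would pick
     */
    static Method resolve(String requested, boolean groupByFunction, Method existingDefault) {
        if (requested == null) {
            return existingDefault; // parameter absent: keep today's behavior
        }
        // valueOf throws IllegalArgumentException for unknown values such as "foo"
        Method method = Method.valueOf(requested.toUpperCase(Locale.ROOT));
        if (groupByFunction && method != Method.FUNCTION) {
            // the invalid combination mentioned above
            throw new IllegalArgumentException(
                "group.method=" + requested + " cannot be combined with grouping by a function");
        }
        return method;
    }

    public static void main(String[] args) {
        // A long field that currently defaults to the function based implementation
        System.out.println(resolve("term", false, Method.FUNCTION));
        System.out.println(resolve(null, false, Method.FUNCTION));
    }
}
```

Rejecting the invalid combination up front keeps the error close to the 
request parameters instead of surfacing later inside a collector.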

Thanks,
Cody

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of 
Martijn v Groningen
Sent: Tuesday, November 29, 2011 2:09 PM
To: [email protected]
Subject: Re: Grouping on Long type uses function query?

If I remember correctly this was done to avoid insane FieldCache usage.

If the Term based grouping implementation is used, an entry of type 
DocTermsIndex is created in the FieldCache for that field. Other search 
features such as sorting and faceting may then create a second entry for the 
same field. Sorting, for example, will in your case add a new FieldCache entry 
of type long for this field. When the Function based grouping implementations 
are used this is not the case. Only one cache entry of type long is put in the 
FieldCache, and sorting or faceting will reuse that entry.
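
To make the difference concrete, here is a toy model in plain Java. It is not 
Lucene's FieldCache API: the class FieldCacheSketch, its methods and the field 
name group_id are made up for illustration, and it only counts distinct 
(field, type) entries rather than measuring memory.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model of a cache keyed by (field, entry type). Not Lucene's FieldCache. */
class FieldCacheSketch {
    // Each distinct "field:type" key stands for one cache entry held in memory.
    private final Set<String> entries = new LinkedHashSet<>();

    void load(String field, String type) {
        entries.add(field + ":" + type); // loading an existing key reuses the entry
    }

    int size() {
        return entries.size();
    }

    /** Term based grouping followed by a sort on the same long field. */
    static int termGroupingThenSort(String field) {
        FieldCacheSketch cache = new FieldCacheSketch();
        cache.load(field, "DocTermsIndex"); // entry created by term based grouping
        cache.load(field, "long");          // second entry created by sorting
        return cache.size();
    }

    /** Function based grouping followed by a sort on the same long field. */
    static int functionGroupingThenSort(String field) {
        FieldCacheSketch cache = new FieldCacheSketch();
        cache.load(field, "long"); // entry created by function based grouping
        cache.load(field, "long"); // sorting finds and reuses the same entry
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("term based:     " + termGroupingThenSort("group_id"));
        System.out.println("function based: " + functionGroupingThenSort("group_id"));
    }
}
```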

The downside of the Function based grouping implementations is that they are 
slower than the Term based implementation.
At the time this feature was integrated into Solr, the decision was made to 
avoid double FieldCache usage per field and to use the slower Function based 
implementation for non-string fields.

The workaround that doesn't involve coding is to make a copy field of type 
string, but then you add more fields / data to your index...

On 29 November 2011 22:25, Young, Cody <[email protected]> wrote:
> Hi All,
>
>
>
> I’m new to solr development. Since I’m new with the code base, I 
> thought I’d double check here before making a JIRA issue. We’re trying 
> to use grouping on a field with a type of long (on trunk):
>
>     <fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
>
>
>
> The performance wasn’t what we were looking for so I’m taking a quick 
> look at the grouping code in solr and I noticed that a string field 
> uses the Term grouping classes (CommandField in 
> /trunk/solr/core/src/java/org/apache/solr/search/Grouping.java). 
> However, when using a long field the Function grouping classes get 
> used (CommandFunc in 
> /trunk/solr/core/src/java/org/apache/solr/search/Grouping.java). When 
> I change it over to using CommandField instead of CommandFunc for long 
> type I get a decrease in QTime (I only did light testing, and just simple 
> queries but it seemed to drop by 50% or so).
>
>
>
> The functionality appears to still work and the grouping tests pass, 
> but as I’m not very familiar with the solr code I wasn’t sure if there 
> was a reason for Long to use CommandFunc instead of CommandField.
>
>
>
> I’m happy to take a stab at making a JIRA issue and a patch if this is 
> indeed an issue, but I’ll need some guidance on the best way to fix 
> this (perhaps instead of using instanceof StrFieldSource or instanceof 
> LongFieldSource there is a better way to check?).
>
>
>
> The change I made to test this was very simple, I just added:
>
> import org.apache.lucene.queries.function.valuesource.LongFieldSource;
>
> and at Line 176 of Grouping.java
>
>      } else if (valueSource instanceof LongFieldSource) {
>          String field = ((LongFieldSource) valueSource).getField();
>          CommandField commandField = new CommandField();
>          commandField.groupBy = field;
>          gc = commandField;
>
>
>
> Thanks,
>
> Cody



--
Met vriendelijke groet,

Martijn van Groningen
