OK, my bad.

The problem was not in the faceting or the querying, but in the method calling 
them: a findAll method that loaded the whole dataset (267k docs) in slices of 
1000. So a while loop was executing a query limited to 1000 results 267 times, 
each one paginated to start where the previous one left off.
When I saw that, I removed the while loop and executed just one ES query, 
limited to 100 docs (it is for display in a UI table, so there is no need for 
thousands of rows on a single screen; 100 is already too many in my opinion), 
and now everything runs extremely smoothly.
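
To make the difference concrete, here is a minimal sketch of the two access patterns. It does not use the real Elasticsearch client: FakeIndex is a hypothetical in-memory stand-in that only counts the queries it receives, and the names findAll and findFirstPage are illustrative, not the actual application code.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: none of these names come from the real application. */
public class PagingSketch {

    /** Hypothetical stand-in for the search backend; counts the queries it receives. */
    static class FakeIndex {
        final int totalDocs;
        int queriesExecuted = 0;

        FakeIndex(int totalDocs) {
            this.totalDocs = totalDocs;
        }

        /** Returns the doc ids in [from, from + size), like a from/size paginated search. */
        List<Integer> search(int from, int size) {
            queriesExecuted++;
            List<Integer> hits = new ArrayList<>();
            int end = Math.min(from + size, totalDocs);
            for (int id = from; id < end; id++) {
                hits.add(id);
            }
            return hits;
        }
    }

    /** The old pattern: keep querying, slice after slice, until every doc is loaded. */
    static List<Integer> findAll(FakeIndex index, int sliceSize) {
        List<Integer> all = new ArrayList<>();
        do {
            // Each iteration restarts where the previous slice left off.
            all.addAll(index.search(all.size(), sliceSize));
        } while (all.size() < index.totalDocs);
        return all;
    }

    /** The fix: a single query, capped at what the table actually displays. */
    static List<Integer> findFirstPage(FakeIndex index, int pageSize) {
        return index.search(0, pageSize);
    }

    public static void main(String[] args) {
        FakeIndex slow = new FakeIndex(267_000);
        findAll(slow, 1000);
        System.out.println("findAll queries: " + slow.queriesExecuted);

        FakeIndex fast = new FakeIndex(267_000);
        findFirstPage(fast, 100);
        System.out.println("first-page queries: " + fast.queriesExecuted);
    }
}
```

With 267,000 docs and slices of 1000, the loop issues 267 queries on every table refresh, while the single-page version issues one.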

Thanks for reading, and check your application logic before 
blaming ES :) 

On Wednesday, 25 June 2014 at 19:11:39 UTC+2, Frederic Esnault wrote:
>
> Actually, performance seems to slow down (and load to increase) mostly when I 
> DESELECT a facet value, because with all filters removed I facet and search 
> over all the docs.
> The more facets and selections I use to restrict my table data, the more 
> efficient the queries become.
>
> But I'm worried that on a local dev computer, with one node, one shard, no 
> replica and only 265k docs, searching/faceting is so slow...
> Any clue about what I could be doing wrong to get CPU load to 100%?
>
> This is what I was doing:
>
>         TermsFacetBuilder facet = FacetBuilders.termsFacet(facetName)
>                 .field(columnName)
>                 .size(1000);
>         SearchRequestBuilder query = client.prepareSearch(datasetName)
>                 .setTypes(RECORD);
>         if (filters != null && filters.size() > 0) {
>             BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
>             for (ColumnFilterConfiguration filter : filters) {
>                 if (filter instanceof TextFilterConfiguration) {
>                     TextFilterConfiguration textFilter = (TextFilterConfiguration) filter;
>                     final String textFilterValue =
>                             StringUtils.trimToNull(textFilter.getTextFilterValue());
>                     queryBuilder.must(QueryBuilders.termQuery(
>                             filter.getColumn().getNormalizedName(), textFilterValue));
>                 } else {
>                     throw new UnsupportedOperationException("Unsupported filter type.");
>                 }
>             }
>             query.setQuery(queryBuilder);
>         } else {
>             query.setQuery(QueryBuilders.matchAllQuery());
>         }
>         query.addFacet(facet);
>         SearchResponse sr = query.execute().actionGet();
>         TermsFacet f = (TermsFacet) sr.getFacets().facetsAsMap().get(facetName);
>
> Now I have switched to aggregations, but the result seems to be the same:
>
>         TermsBuilder agg =
>                 AggregationBuilders.terms(facetName).field(columnName).size(1000);
>         SearchRequestBuilder query =
>                 client.prepareSearch().setSize(0).addAggregation(agg);
>         if (filters != null && filters.size() > 0) {
>             BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
>             for (ColumnFilterConfiguration filter : filters) {
>                 if (filter instanceof TextFilterConfiguration) {
>                     TextFilterConfiguration textFilter = (TextFilterConfiguration) filter;
>                     final String textFilterValue =
>                             StringUtils.trimToNull(textFilter.getTextFilterValue());
>                     queryBuilder.must(QueryBuilders.termQuery(
>                             filter.getColumn().getNormalizedName(), textFilterValue));
>                 } else {
>                     throw new UnsupportedOperationException("Unsupported filter type.");
>                 }
>             }
>             query.setQuery(queryBuilder);
>         }
>         SearchResponse sr = query.execute().actionGet();
>
> In this last version, I tried not setting any query when no filters are 
> defined (to avoid a matchAll), and otherwise setting a boolean query that 
> must match all the filter terms.
> I also set the query result size to 0, to avoid fetching hits and get only 
> the aggregation result.
> No significant improvement...
>
> Help would be appreciated :)
>
> On Wednesday, 25 June 2014 at 13:32:56 UTC+2, Frederic Esnault wrote:
>>
>> Hi all,
>>
>> I'm currently in a development phase, so I'm testing my work on my local 
>> machine (a MacBook Pro with 16 GB of RAM, a 512 GB SSD and 4 cores).
>>
>>   Processor Name: Intel Core i7
>>   Processor Speed: 2.3 GHz
>>   Number of Processors: 1
>>   Total Number of Cores: 4
>>   L2 Cache (per Core): 256 KB
>>   L3 Cache: 6 MB
>>   Memory: 16 GB
>>
>>
>> I set up a one-node / one-shard / no-replica ES instance, into which I 
>> index approx. 264k documents in bulk mode (takes 16 s including 
>> preparation time).
>>
>> Then, in my application, I see a table with my data, and I can create 
>> facets on columns.
>>
>> When I have more than one facet, things become difficult for ES.
>>
>> Checking one facet value updates the table data (filtering on the 
>> selected facet value) and updates the other facet(s), so that they show 
>> facet values for the filtered data only.
>>
>> It is when I select one facet value that ES begins to use 100% CPU (I 
>> even saw 220% in the output of top), and I don't really know why. 
>> Basically I send 2 queries: one is the faceting query with filters, the 
>> other is the table data query with filters.
>>
>> Do you have any idea what could cause the high CPU load on the local ES 
>> node? (For info, I gave the ES node 5 GB of memory, for both Xmx and Xms.)
>>
>
