OK, my bad.
The problem was not in the faceting or the querying, but in the method calling
them: a findAll method that loaded the whole dataset (267k docs) in slices of
1000. So a while loop was executing 267 queries, each limited to 1000 results
and paginated to resume where the previous one left off.
When I saw that, I removed the while loop and executed just one ES query,
limited to 100 docs (it is for UI display in a table, so there is no need for
thousands of rows on a single screen; 100 is already too many in my opinion),
and now everything runs extremely smoothly.
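
To illustrate the difference, here is a minimal sketch in plain Java. It is not the actual application code: PageSource, CountingSource and TableLoader are hypothetical names, and the fake backend only counts the queries it receives instead of talking to ES.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the search backend (not an ES API). */
interface PageSource {
    /** Total number of matching documents. */
    int total();

    /** Returns the document ids in [from, from + size), capped at total(). */
    List<Integer> fetch(int from, int size);
}

/** Fake backend over `total` documents that counts the queries it receives. */
class CountingSource implements PageSource {
    final int total;
    int queries = 0;

    CountingSource(int total) {
        this.total = total;
    }

    @Override
    public int total() {
        return total;
    }

    @Override
    public List<Integer> fetch(int from, int size) {
        queries++;
        List<Integer> page = new ArrayList<>();
        for (int i = from; i < Math.min(from + size, total); i++) {
            page.add(i);
        }
        return page;
    }
}

class TableLoader {
    /** Old behaviour: one query per slice until the whole dataset is loaded. */
    static List<Integer> findAll(PageSource source, int pageSize) {
        List<Integer> all = new ArrayList<>();
        for (int from = 0; from < source.total(); from += pageSize) {
            all.addAll(source.fetch(from, pageSize));
        }
        return all;
    }

    /** The fix: a single bounded query, just enough rows for the table. */
    static List<Integer> findFirstPage(PageSource source, int limit) {
        return source.fetch(0, limit);
    }
}
```

With 267,000 documents and slices of 1000, findAll sends 267 queries and holds every document in memory, while findFirstPage with a limit of 100 sends one query and returns 100 rows.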
Thanks for reading, and check your application logic carefully before
blaming ES :)
On Wednesday, June 25, 2014 at 19:11:39 UTC+2, Frederic Esnault wrote:
>
> Actually, it seems performance slows down (and load increases) mostly when I
> DESELECT a facet value, because I then facet and search on all the docs
> (if I remove all filters).
> The more facets and selections I make to restrict my table data, the more
> efficient the queries become.
>
> But I'm worried that on a local dev computer, one node with
> one shard and no replica, with only 265k docs, search/faceting is so
> slow...
> Any clue about what I could be doing wrong to get CPU load to 100%?
>
> I was doing it like this:
>
> // Terms facet on the column, returning up to 1000 distinct values.
> TermsFacetBuilder facet = FacetBuilders.termsFacet(facetName)
>         .field(columnName)
>         .size(1000);
> SearchRequestBuilder query = client.prepareSearch(datasetName)
>         .setTypes(RECORD);
> if (filters != null && filters.size() > 0) {
>     // Every selected filter becomes a mandatory term clause.
>     BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
>     for (ColumnFilterConfiguration filter : filters) {
>         if (filter instanceof TextFilterConfiguration) {
>             TextFilterConfiguration textFilter = (TextFilterConfiguration) filter;
>             final String textFilterValue =
>                     StringUtils.trimToNull(textFilter.getTextFilterValue());
>             queryBuilder.must(QueryBuilders.termQuery(
>                     filter.getColumn().getNormalizedName(), textFilterValue));
>         } else {
>             throw new UnsupportedOperationException("Unsupported filter type.");
>         }
>     }
>     query.setQuery(queryBuilder);
> } else {
>     query.setQuery(QueryBuilders.matchAllQuery());
> }
> query.addFacet(facet);
> SearchResponse sr = query.execute().actionGet();
> TermsFacet f = (TermsFacet) sr.getFacets().facetsAsMap().get(facetName);
>
> Now I have switched to aggregations, but the result seems to be the same:
>
> // Terms aggregation on the column, returning up to 1000 buckets.
> TermsBuilder agg = AggregationBuilders.terms(facetName)
>         .field(columnName)
>         .size(1000);
> // Size 0: no hits are fetched, only the aggregation result.
> SearchRequestBuilder query = client.prepareSearch()
>         .setSize(0)
>         .addAggregation(agg);
> if (filters != null && filters.size() > 0) {
>     // Every selected filter becomes a mandatory term clause.
>     BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
>     for (ColumnFilterConfiguration filter : filters) {
>         if (filter instanceof TextFilterConfiguration) {
>             TextFilterConfiguration textFilter = (TextFilterConfiguration) filter;
>             final String textFilterValue =
>                     StringUtils.trimToNull(textFilter.getTextFilterValue());
>             queryBuilder.must(QueryBuilders.termQuery(
>                     filter.getColumn().getNormalizedName(), textFilterValue));
>         } else {
>             throw new UnsupportedOperationException("Unsupported filter type.");
>         }
>     }
>     query.setQuery(queryBuilder);
> }
> SearchResponse sr = query.execute().actionGet();
>
> In this last version, I tried not setting any query at all, to avoid a matchAll
> when no filters are defined; otherwise I give it a boolean query matching all
> filter terms.
> I also set the query result size to 0 to avoid fetching documents for the query
> result and get only the aggregation result.
> No significant improvement...
>
> Help would be appreciated :)
>
> On Wednesday, June 25, 2014 at 13:32:56 UTC+2, Frederic Esnault wrote:
>>
>> Hi all,
>>
>> I'm currently in a development phase, so I'm testing my developments on my
>> local machine (a MacBook Pro with 16 GB RAM, a 512 GB SSD, and 4 cores).
>>
>> Processor Name: Intel Core i7
>> Processor Speed: 2.3 GHz
>> Number of Processors: 1
>> Total Number of Cores: 4
>> L2 Cache (per Core): 256 KB
>> L3 Cache: 6 MB
>> Memory: 16 GB
>>
>> I set up a one node / one shard / no replica ES instance where I index approx.
>> 264k documents in bulk mode (it takes 16 s including preparation time).
>>
>> Then in my application I see a table with my data, and I can create
>> facets on columns.
>>
>> When I have more than one facet, things start to get difficult for ES.
>>
>> Checking one facet value updates the table data (filtering on the
>> selected facet value) and updates the other facet(s), to get facet values
>> on the filtered data only.
>>
>> It is when I select one facet value that ES begins to use 100% CPU (I
>> even saw 220% in the top command output), and I don't really know why.
>> Basically I send 2 queries: one is the faceting query with filtering, the
>> other is the table data query with filters.
>>
>> Do you have any idea what could cause the high CPU load on the local ES
>> node? (For info, I gave the ES node 5g of memory (Xmx and Xms).)
>>
>