Re: DocValue on Strings slow and OOM

Per Steffensen Wed, 06 Nov 2013 02:00:28 -0800

Forget about the quoted comment at the bottom below. It is not true. Both the fast/efficient and the slow/memory-consuming query follow the getTermCounts-path.

But I have identified another place where they take different paths in the code. In SimpleFacets.getTermCounts you will find the code below. I have pointed out where the two queries go.

    if (params.getFieldBool(field, GroupParams.GROUP_FACET, false)) {
      counts = getGroupedCounts(searcher, docs, field, multiToken, offset, limit, mincount, missing, sort, prefix);
    } else {
      assert method != null;
      switch (method) {
        case ENUM:
          assert TrieField.getMainValuePrefix(ft) == null;
          counts = getFacetTermEnumCounts(searcher, docs, field, offset, limit, mincount, missing, sort, prefix);
          break;
        case FCS:
          assert !multiToken;
          if (ft.getNumericType() != null && !sf.multiValued()) {
*** ---> The fast/efficient query (facet.field=a_dlng_doc_sto) goes here
            // force numeric faceting
            if (prefix != null && !prefix.isEmpty()) {
              throw new SolrException(ErrorCode.BAD_REQUEST, FacetParams.FACET_PREFIX + " is not supported on numeric types");
            }
            counts = NumericFacets.getCounts(searcher, docs, field, offset, limit, mincount, missing, sort);
          } else {
            PerSegmentSingleValuedFaceting ps = new PerSegmentSingleValuedFaceting(searcher, docs, field, offset, limit, mincount, missing, sort, prefix);
            Executor executor = threads == 0 ? directExecutor : facetExecutor;
            ps.setNumThreads(threads);
            counts = ps.getFacetCounts(executor);
          }
          break;
        case FC:
          if (sf.hasDocValues()) {
*** ---> The slow/memory-consuming query (facet.field=c_dstr_doc_sto) goes here
            counts = DocValuesFacets.getCounts(searcher, docs, field, offset, limit, mincount, missing, sort, prefix);
          } else if (multiToken || TrieField.getMainValuePrefix(ft) != null) {
            UnInvertedField uif = UnInvertedField.getUnInvertedField(field, searcher);
            counts = uif.getCounts(searcher, docs, offset, limit, mincount, missing, sort, prefix);
          } else {
            counts = getFieldCacheCounts(searcher, docs, field, offset, limit, mincount, missing, sort, prefix);
          }
          break;
        default:
          throw new AssertionError();
      }
    }
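
To make the branch selection easier to follow, here is a minimal stand-alone sketch of the decision made by the code quoted above. It is not Solr code: the class FacetPathSketch, the method choosePath and its boolean parameters are hypothetical stand-ins for the checks in the quote (the GROUP_FACET parameter, ft.getNumericType() != null, sf.multiValued(), sf.hasDocValues(), and multiToken || TrieField.getMainValuePrefix(ft) != null). It only returns the name of the routine that would be called.

```java
import java.util.Objects;

/** Stand-alone model of the branch selection in the quoted code; not Solr code. */
final class FacetPathSketch {

    /** Mirrors the three facet methods handled by the quoted switch. */
    enum Method { ENUM, FCS, FC }

    /**
     * Returns the name of the counting routine the quoted code would pick.
     * All parameters are hypothetical stand-ins for the checks in the quote.
     */
    static String choosePath(Method method,
                             boolean groupFacet,
                             boolean numeric,
                             boolean multiValued,
                             boolean hasDocValues,
                             boolean multiTokenOrTriePrefix) {
        if (groupFacet) {
            return "getGroupedCounts";
        }
        switch (Objects.requireNonNull(method)) {
            case ENUM:
                return "getFacetTermEnumCounts";
            case FCS:
                // Single-valued numeric fields are forced to numeric faceting.
                return (numeric && !multiValued)
                        ? "NumericFacets.getCounts"
                        : "PerSegmentSingleValuedFaceting.getFacetCounts";
            case FC:
                if (hasDocValues) {
                    return "DocValuesFacets.getCounts";
                }
                if (multiTokenOrTriePrefix) {
                    return "UnInvertedField.getCounts";
                }
                return "getFieldCacheCounts";
            default:
                throw new AssertionError();
        }
    }

    public static void main(String[] args) {
        // Branch reported for a_dlng_doc_sto: FCS on a single-valued numeric field
        // (the remaining flags are not consulted on this branch).
        System.out.println(choosePath(Method.FCS, false, true, false, false, false));
        // Branch reported for c_dstr_doc_sto: FC on a field with doc values.
        System.out.println(choosePath(Method.FC, false, false, false, true, false));
    }
}
```

With these inputs the sketch reports the NumericFacets path for the first call and the DocValuesFacets path for the second, which matches the two markers in the quoted code.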

I also believe I have found where the huge memory allocation is done. I did a memory dump while the slow/memory-consuming c_dstr_doc_sto-query was going on (plenty of time to do that - 100+ secs). It seems that a lot of memory is allocated under SlowCompositeReaderWrapper.cachedOrdMaps, which holds HashMaps containing MultiDocValues$OrdinalMaps as values, and those MultiDocValues$OrdinalMaps have an ordDeltas field, an array of MonotonicAppendingLongBuffers ... bla bla ... containing Packed64 containing long-arrays. See https://dl.dropboxusercontent.com/u/25718039/mem-dump-while-searching-on-facet.field-c_dstr_doc_sto.png
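
As a rough illustration of why such a structure can get large, the toy model below builds a segment-ordinal to global-ordinal mapping for a string field. It is only a sketch under assumptions: OrdinalMapSketch and its methods are hypothetical, each segment is assumed to expose its distinct terms in sorted order, and plain long arrays are used where the memory dump shows packed delta buffers. The point is just that the mapping holds one entry per distinct term per segment, so it grows with the number of unique string values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

/** Toy model of a per-segment to global ordinal mapping; not Lucene code. */
final class OrdinalMapSketch {

    /**
     * For each segment, returns an array mapping segment ordinal to global ordinal.
     * Each String[] is assumed to hold that segment's distinct terms in sorted order.
     */
    static long[][] buildSegmentToGlobal(List<String[]> segmentTerms) {
        // Merge all segment dictionaries into one sorted global dictionary.
        TreeSet<String> merged = new TreeSet<>();
        for (String[] terms : segmentTerms) {
            merged.addAll(Arrays.asList(terms));
        }
        List<String> global = new ArrayList<>(merged);

        long[][] map = new long[segmentTerms.size()][];
        for (int seg = 0; seg < segmentTerms.size(); seg++) {
            String[] terms = segmentTerms.get(seg);
            map[seg] = new long[terms.length];
            for (int ord = 0; ord < terms.length; ord++) {
                // Position of the term in the global dictionary is its global ordinal.
                map[seg][ord] = Collections.binarySearch(global, terms[ord]);
            }
        }
        return map;
    }

    /** Total number of stored mappings: one per distinct term per segment. */
    static long entryCount(long[][] map) {
        long total = 0;
        for (long[] perSegment : map) {
            total += perSegment.length;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String[]> segments = new ArrayList<>();
        segments.add(new String[] {"a", "c"});
        segments.add(new String[] {"b", "c", "d"});
        long[][] map = buildSegmentToGlobal(segments);
        System.out.println(Arrays.toString(map[0]) + " " + Arrays.toString(map[1]));
        System.out.println("entries: " + entryCount(map));
    }
}
```

A field where nearly every document has its own string value would, in this model, need roughly one entry per document, whereas the numeric path quoted above never builds such a mapping.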

SlowCompositeReaderWrapper and all this memory-allocation do not seem to be part of the fast a_dlng_doc_sto-query.

Does this information provide any leads on how to fix the response-time/memory-consumption issue? Maybe it helps in telling whether going to 4.5 will fix the issue?


Regards, Per Steffensen

On 11/5/13 1:47 PM, Per Steffensen wrote:

Looking at thread dumps:
It seems like one of the major differences in what is done for c_dstr_doc_sto vs a_dlng_doc_sto is in SimpleFacets.getFacetFieldCounts, where c_dstr_doc_sto takes the "getTermCounts"-path and a_dlng_doc_sto takes the "getListedTermCounts"-path.

    String termList = localParams == null ? null : localParams.get(CommonParams.TERMS);
    if (termList != null) {
      res.add(key, getListedTermCounts(facetValue, termList));
    } else {
      res.add(key, getTermCounts(facetValue));
    }

getTermCounts seems to do a lot more and to be a lot more complex than getListedTermCounts.
