Hi!
On 01.07.22 00:46, Greg Miller wrote:
> Have you considered taxonomy faceting for your use-case? Because the
> taxonomy structure is maintained in a separate index, it's
> (relatively) trivial to iterate all direct child ordinals of a given
> dimension. The cost of mapping to a global ordinal space is done when
> the index is merged.
Thanks for the tip. I will certainly look into it.
> Separately, I'd be curious about where you're running into performance
> issues within the context of your system. Is the cost you're concerned
> with building up the ordinal map? That's certainly expensive, but it's
> a one-time cost (until you refresh your index).
That's not the problem. The index rarely changes during operation anyway.
> Or are you concerned
> with the actual map lookup within your tight loop?
Yes. The index I tested has about 3M documents and about 180M doc value
ords overall. Just iterating over them, without retrieving the actual
values or building the result map, takes close to 3 s. All the time seems
to be spent in SortedSetDocValues.nextOrd and LongValues.get.
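
A minimal sketch of one way to take the LongValues.get call out of the
hot loop, assuming only per-ord counts are needed: count in segment-local
ord space first, then translate each distinct segment ord to its global
ord once afterwards. Plain arrays stand in for the Lucene pieces here
(docOrds for what nextOrd() would return per document, segToGlobal for the
segment-to-global mapping), and the class and method names are made up for
illustration. Whether this helps on a real index is untested here.

```java
import java.util.Arrays;

public class SegmentOrdCounting {

    /**
     * docOrds[d]     : segment-local ords of document d (stand-in for nextOrd()).
     * segToGlobal[o] : global ord of segment ord o (stand-in for LongValues.get).
     * Returns counts indexed by global ord.
     */
    static int[] countGlobal(int[][] docOrds, long[] segToGlobal, int globalOrdCount) {
        // Hot loop: one array increment per ord occurrence, no global lookup.
        int[] segCounts = new int[segToGlobal.length];
        for (int[] ords : docOrds) {
            for (int ord : ords) {
                segCounts[ord]++;
            }
        }
        // Cold loop: one mapping lookup per distinct segment ord.
        int[] globalCounts = new int[globalOrdCount];
        for (int ord = 0; ord < segCounts.length; ord++) {
            if (segCounts[ord] != 0) {
                globalCounts[(int) segToGlobal[ord]] += segCounts[ord];
            }
        }
        return globalCounts;
    }

    public static void main(String[] args) {
        int[][] docOrds = { {0, 2}, {1, 2}, {2} };   // three toy documents
        long[] segToGlobal = {3, 5, 7};              // toy segment-to-global map
        System.out.println(Arrays.toString(countGlobal(docOrds, segToGlobal, 8)));
    }
}
```

With this shape the number of mapping lookups is bounded by the number of
distinct ords per segment rather than by the total number of ord
occurrences.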
> If the latter, you
> could consider doing more work at the slice-level by separately
> determining the child ords for each dim ord within the context of each
> segment (there's no off-the-shelf code for this that I'm aware of, so
> you'd have to roll your own).
I was thinking about this as well. So let's say I have the ord ranges
for the dimensions per segment. I guess I could easily test this by
making sure there is only one segment, so I can use the global ord ranges
I already have. What would be the best way to jump to those ords for each
document? If I use SortedSetDocValues, I'd still have to iterate through
all ords per document, right? Maybe that's not really the problem. I've
tried to time this, but both the profiler and hand-crafted timing code
massively skew the results, so I'm not sure I trust those measurements.
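
To make the question concrete, here is a rough sketch of the per-document
pass, assuming the ords of a document come back in ascending order and
each dimension owns one contiguous, inclusive ord range. A sorted long[]
stands in for repeated nextOrd() calls, and DimRangeBucketing, countPerDim,
rangeStart and rangeEnd are hypothetical names. It does not jump to an
ord; it walks the ords once, in step with the ranges, and stops as soon as
an ord lies past the last range of interest.

```java
import java.util.Arrays;

public class DimRangeBucketing {

    /**
     * sortedDocOrds       : ords of one document, ascending.
     * rangeStart/rangeEnd : inclusive ord range per dimension, ascending
     *                       and non-overlapping.
     * Returns how many of the document's ords fall into each dimension.
     */
    static int[] countPerDim(long[] sortedDocOrds, long[] rangeStart, long[] rangeEnd) {
        int[] perDim = new int[rangeStart.length];
        int dim = 0;
        for (long ord : sortedDocOrds) {
            // Skip ranges that end before this ord.
            while (dim < rangeStart.length && rangeEnd[dim] < ord) {
                dim++;
            }
            if (dim == rangeStart.length) {
                break; // every remaining ord is past the last range
            }
            if (ord >= rangeStart[dim]) {
                perDim[dim]++;
            }
        }
        return perDim;
    }

    public static void main(String[] args) {
        long[] docOrds = {1, 4, 5, 9, 12};
        long[] start = {3, 8};
        long[] end = {5, 10};
        System.out.println(Arrays.toString(countPerDim(docOrds, start, end)));
    }
}
```

The early exit only saves work for ords above the highest range, so ords
below or between the ranges of interest are still visited.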
Cheers
harry