[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16744344#comment-16744344 ]
Mike Sokolov edited comment on LUCENE-8635 at 1/16/19 6:54 PM:
---------------------------------------------------------------

Following a suggestion from [~mikemccand] I tried a slightly different version of this, making use of randomAccessSlice to avoid some calls to seek(), and this gives better perf in the benchmarks.

I also spent some time trying to understand FST's backwards-seeking behavior. Based on my crude understanding, and a comment from Mike again, it seems that with some work it would be possible to make it more naturally forward-seeking, but it's not obvious that in general you would get more local, cache-friendly access patterns from that. Still, you might; it probably needs some experimentation to know for sure.

Here are the benchmark #s from the random-access patch:

{noformat}
Task                        QPS before   StdDev   QPS after   StdDev  Pct diff
PKLookup                        133.62   (2.2%)      123.74   (1.5%)   -7.4%  ( -10% - -3%)
AndHighLow                     3411.49   (3.2%)     3268.04   (3.1%)   -4.2%  ( -10% - 2%)
BrowseDayOfYearTaxoFacets     10067.18   (4.3%)     9828.65   (3.5%)   -2.4%  ( -9% - 5%)
LowTerm                        3567.48   (1.2%)     3489.27   (1.7%)   -2.2%  ( -5% - 0%)
Fuzzy1                          147.67   (3.1%)      144.65   (2.4%)   -2.0%  ( -7% - 3%)
BrowseMonthTaxoFacets         10102.27   (4.2%)     9901.49   (4.1%)   -2.0%  ( -9% - 6%)
Fuzzy2                           62.00   (2.8%)       60.87   (2.4%)   -1.8%  ( -6% - 3%)
MedTerm                        2694.87   (2.0%)     2647.08   (2.1%)   -1.8%  ( -5% - 2%)
AndHighMed                     1171.52   (2.7%)     1154.25   (2.8%)   -1.5%  ( -6% - 4%)
HighTerm                       2061.53   (2.3%)     2032.84   (2.5%)   -1.4%  ( -6% - 3%)
MedSloppyPhrase                 266.60   (3.4%)      263.01   (4.2%)   -1.3%  ( -8% - 6%)
OrHighHigh                      278.90   (4.0%)      275.35   (4.7%)   -1.3%  ( -9% - 7%)
HighSloppyPhrase                107.68   (5.5%)      106.34   (5.6%)   -1.2%  ( -11% - 10%)
Respell                         118.26   (2.1%)      116.95   (2.2%)   -1.1%  ( -5% - 3%)
AndHighHigh                     472.93   (4.4%)      467.78   (3.3%)   -1.1%  ( -8% - 6%)
OrHighMed                       755.21   (2.9%)      748.34   (3.3%)   -0.9%  ( -6% - 5%)
MedSpanNear                     308.31   (3.3%)      305.59   (3.8%)   -0.9%  ( -7% - 6%)
Wildcard                        869.37   (3.5%)      862.74   (1.9%)   -0.8%  ( -5% - 4%)
HighTermMonthSort               871.33   (7.1%)      865.80   (6.1%)   -0.6%  ( -12% - 13%)
MedPhrase                       449.39   (3.0%)      446.55   (2.4%)   -0.6%  ( -5% - 4%)
LowSpanNear                     391.10   (3.3%)      388.77   (3.8%)   -0.6%  ( -7% - 6%)
LowSloppyPhrase                 406.57   (3.8%)      404.23   (3.6%)   -0.6%  ( -7% - 7%)
HighPhrase                      239.84   (3.7%)      238.78   (3.3%)   -0.4%  ( -7% - 6%)
Prefix3                        1230.56   (5.0%)     1225.52   (2.9%)   -0.4%  ( -7% - 7%)
HighSpanNear                    107.34   (5.2%)      107.20   (5.3%)   -0.1%  ( -10% - 10%)
LowPhrase                       438.52   (3.4%)      438.14   (2.5%)   -0.1%  ( -5% - 5%)
BrowseDateTaxoFacets             11.14   (4.0%)       11.16   (7.0%)    0.2%  ( -10% - 11%)
HighTermDayOfYearSort           606.85   (6.7%)      608.65   (5.4%)    0.3%  ( -11% - 13%)
IntNRQ                          987.08  (12.5%)      990.96  (13.5%)    0.4%  ( -22% - 30%)
OrHighLow                       553.72   (3.2%)      558.09   (3.5%)    0.8%  ( -5% - 7%)
BrowseDayOfYearSSDVFacets        38.23   (3.9%)       38.66   (4.1%)    1.1%  ( -6% - 9%)
BrowseMonthSSDVFacets            42.05   (3.5%)       42.57   (3.7%)    1.2%  ( -5% - 8%)
{noformat}

> Lazy loading Lucene FST offheap using mmap
> ------------------------------------------
>
>                 Key: LUCENE-8635
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8635
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/FSTs
>         Environment: I used below setup for es_rally tests:
> single node i3.xlarge running ES 6.5
> es_rally was running on another i3.xlarge instance
>            Reporter: Ankit Jain
>            Priority: Major
>         Attachments: offheap.patch, ra.patch, rally_benchmark.xlsx
>
>
> Currently, FST loads all the terms into heap memory during index open. This
> causes frequent JVM OOM issues if the term size gets big. A better way of
> doing this will be to lazily load FST using mmap. That ensures only the
> required terms get loaded into memory.
>
> Lucene can expose API for providing list of fields to load terms offheap. I'm
> planning to take following approach for this:
> # Add a boolean property fstOffHeap in FieldInfo
> # Pass list of offheap fields to lucene during index open (ALL can be
> special keyword for loading ALL fields offheap)
> # Initialize the fstOffHeap property during lucene index open
> # FieldReader invokes default FST constructor or OffHeap constructor based
> on fstOffHeap field
>
> I created a patch (that loads all fields offheap), did some benchmarks using
> es_rally and results look good.
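
For readers unfamiliar with the idea, here is a minimal sketch of the difference between loading bytes onto the heap up front and memory-mapping them so the OS pages them in on first access. It uses only JDK classes (FileChannel, MappedByteBuffer); the class and method names (OffHeapBytesSketch, loadOnHeap, mapOffHeap, sumBackwards) are hypothetical, and this is not Lucene's FST code or either attached patch. The backwards loop uses absolute gets, which is roughly the kind of access that avoids maintaining a seek position, under the assumption that this mirrors what the random-access variant is after.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration only; not Lucene's FST or the attached patches. */
public class OffHeapBytesSketch {

    /** On-heap variant: copies the whole file into a byte[] when called. */
    static byte[] loadOnHeap(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    /** Off-heap variant: maps the file; the OS faults pages in on first access. */
    static MappedByteBuffer mapOffHeap(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // A mapping, once established, does not depend on the channel staying open.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    /** Reads 'count' bytes backwards from 'start' using absolute gets (no position state). */
    static long sumBackwards(MappedByteBuffer buf, int start, int count) {
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += buf.get(start - i) & 0xFF; // absolute read at index start - i
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        // Write a small test file whose byte at offset i is (byte) i.
        Path file = Files.createTempFile("fst-sketch", ".bin");
        file.toFile().deleteOnExit();
        byte[] data = new byte[1 << 16];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        Files.write(file, data);

        MappedByteBuffer mapped = mapOffHeap(file);
        // Only the pages covering offsets 0..255 need to be resident for this read.
        System.out.println(sumBackwards(mapped, 255, 256));
    }
}
```

The trade-off is the one the benchmarks above are probing: the mapped variant keeps the bytes out of the JVM heap, at the cost of going through a buffer for each access instead of indexing a byte[] directly.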