[ 
https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16753595#comment-16753595
 ] 

Ankit Jain edited comment on LUCENE-8635 at 1/27/19 9:26 PM:
-------------------------------------------------------------

I also independently ran the performance benchmark after removing the array 
reversal in readBytes from the original patch, but the results were similar to 
the earlier ones.

Since we are leaning towards keeping this optional, I created another patch, 
[^optional_offheap_ra.patch], based on the reverse random access reader patch 
[^ra.patch]. It adds FST.offheap as a system property to allow toggling between 
offheap and onheap.
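
As a rough illustration of how such a toggle could be read, here is a minimal sketch. The class and method names (FstLoadMode, useOffHeap) are hypothetical, and defaulting to onheap when the property is absent is an assumption; the attached patch may handle this differently.

```java
// Hypothetical helper for illustration only; not taken from optional_offheap_ra.patch.
final class FstLoadMode {
    // Name of the system property mentioned above.
    static final String PROPERTY = "FST.offheap";

    // Returns true only when the property is set to "true" (case-insensitive),
    // e.g. via -DFST.offheap=true on the java command line.
    // Falling back to on-heap when the property is missing is an assumption.
    static boolean useOffHeap() {
        return Boolean.parseBoolean(System.getProperty(PROPERTY, "false"));
    }

    public static void main(String[] args) {
        System.out.println(useOffHeap() ? "offheap" : "onheap");
    }
}
```

Reading the value at the point of use, rather than caching it in a static field, keeps the sketch easy to flip between benchmark runs.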

The results for wikimedium10k with:

java ...... -DFST.offheap=true
 
{code}
                     Task  QPS baseline      StdDev QPS candidate      StdDev  Pct diff
                 PKLookup        172.88      (3.3%)        153.94      (3.7%)    -11.0% ( -17% -   -4%)
                  LowTerm      12229.10      (3.5%)      11032.10      (3.3%)     -9.8% ( -16% -   -3%)
               AndHighLow       4679.22      (3.2%)       4349.12      (3.3%)     -7.1% ( -13% -    0%)
                  MedTerm      10179.43      (5.4%)       9533.14      (3.4%)     -6.3% ( -14% -    2%)
                 HighTerm       5123.89      (3.1%)       4814.09      (4.7%)     -6.0% ( -13% -    1%)
                LowPhrase       3459.57      (5.3%)       3253.20      (7.5%)     -6.0% ( -17% -    7%)
                MedPhrase       2815.82      (5.1%)       2654.13      (5.6%)     -5.7% ( -15% -    5%)
              MedSpanNear       2196.98      (4.4%)       2082.39      (3.9%)     -5.2% ( -12% -    3%)
         HighSloppyPhrase       1680.32      (5.7%)       1592.91      (8.0%)     -5.2% ( -17% -    9%)
          LowSloppyPhrase       3205.99      (4.9%)       3045.94      (4.4%)     -5.0% ( -13% -    4%)
                OrHighMed       1960.52      (4.8%)       1866.03      (6.2%)     -4.8% ( -15% -    6%)
                 Wildcard       1388.45      (8.5%)       1324.82      (6.2%)     -4.6% ( -17% -   11%)
               OrHighHigh       1304.03      (7.8%)       1247.72      (5.1%)     -4.3% ( -16% -    9%)
               AndHighMed       2268.22      (2.8%)       2171.27      (2.8%)     -4.3% (  -9% -    1%)
          MedSloppyPhrase       2697.01      (6.1%)       2597.71      (5.0%)     -3.7% ( -13% -    7%)
    HighTermDayOfYearSort       1719.25      (5.3%)       1657.10      (5.8%)     -3.6% ( -13% -    7%)
             HighSpanNear       1624.69      (4.4%)       1567.35      (5.6%)     -3.5% ( -12% -    6%)
              AndHighHigh       1645.28      (3.7%)       1589.76      (2.9%)     -3.4% (  -9% -    3%)
              LowSpanNear       2319.98      (6.0%)       2246.30      (5.5%)     -3.2% ( -13% -    8%)
                OrHighLow       2264.00      (6.0%)       2200.33      (4.3%)     -2.8% ( -12% -    7%)
        HighTermMonthSort       4829.60      (3.9%)       4700.35      (2.5%)     -2.7% (  -8% -    3%)
                   Fuzzy2        172.46      (4.8%)        168.02      (5.4%)     -2.6% ( -12% -    8%)
               HighPhrase       2525.60      (6.3%)       2464.09      (5.3%)     -2.4% ( -13% -    9%)
                   Fuzzy1        585.39      (4.4%)        571.20      (4.1%)     -2.4% ( -10% -    6%)
                  Prefix3       1359.75      (8.2%)       1330.98      (5.8%)     -2.1% ( -14% -   12%)
                  Respell        501.29      (3.2%)        490.92      (4.7%)     -2.1% (  -9% -    5%)
    BrowseMonthTaxoFacets       8450.33      (4.7%)       8354.07      (4.9%)     -1.1% ( -10% -    8%)
BrowseDayOfYearSSDVFacets       2016.73      (3.4%)       2009.96      (4.0%)     -0.3% (  -7% -    7%)
BrowseDayOfYearTaxoFacets       8303.67      (6.4%)       8294.91      (5.6%)     -0.1% ( -11% -   12%)
                   IntNRQ       1380.11      (2.1%)       1380.36      (2.0%)      0.0% (  -3% -    4%)
     BrowseDateTaxoFacets       3564.47      (3.2%)       3575.88      (3.2%)      0.3% (  -5% -    7%)
    BrowseMonthSSDVFacets       2247.87      (5.4%)       2276.28      (3.5%)      1.3% (  -7% -   10%)
{code}

java ...... -DFST.offheap=false

{code}
                     Task  QPS baseline      StdDev QPS candidate      StdDev  Pct diff
                LowPhrase       3244.01      (6.3%)       3201.30      (7.0%)     -1.3% ( -13% -   12%)
                 PKLookup        171.24      (3.3%)        169.28      (5.3%)     -1.1% (  -9% -    7%)
          MedSloppyPhrase       2867.58      (6.3%)       2848.80      (6.9%)     -0.7% ( -13% -   13%)
    BrowseMonthTaxoFacets       8565.92      (4.9%)       8514.51      (5.3%)     -0.6% ( -10% -   10%)
                  Respell        529.20      (3.6%)        526.69      (3.4%)     -0.5% (  -7% -    6%)
                 Wildcard       1252.25      (7.6%)       1249.97      (7.3%)     -0.2% ( -13% -   15%)
                   IntNRQ       1536.74      (1.7%)       1536.53      (2.1%)     -0.0% (  -3% -    3%)
BrowseDayOfYearTaxoFacets       8490.89      (6.3%)       8490.94      (5.5%)      0.0% ( -11% -   12%)
              LowSpanNear       2391.88      (3.0%)       2392.15      (4.9%)      0.0% (  -7% -    8%)
                  LowTerm      12382.95      (4.3%)      12384.63      (3.6%)      0.0% (  -7% -    8%)
        HighTermMonthSort       4906.65      (3.3%)       4910.32      (4.3%)      0.1% (  -7% -    7%)
              AndHighHigh       1652.60      (5.4%)       1660.85      (4.9%)      0.5% (  -9% -   11%)
BrowseDayOfYearSSDVFacets       2006.52      (4.5%)       2017.41      (3.3%)      0.5% (  -6% -    8%)
                   Fuzzy2        176.18      (4.7%)        177.27      (3.9%)      0.6% (  -7% -    9%)
              MedSpanNear       2668.05      (6.7%)       2688.05      (3.9%)      0.7% (  -9% -   12%)
                 HighTerm       5556.40      (4.9%)       5611.56      (3.8%)      1.0% (  -7% -   10%)
               AndHighMed       2257.29      (4.7%)       2281.54      (4.0%)      1.1% (  -7% -   10%)
                OrHighMed       1611.93      (4.5%)       1631.79      (4.0%)      1.2% (  -6% -   10%)
     BrowseDateTaxoFacets       3521.57      (4.7%)       3565.96      (4.9%)      1.3% (  -7% -   11%)
                   Fuzzy1        634.59      (3.8%)        642.78      (5.8%)      1.3% (  -7% -   11%)
               AndHighLow       4739.69      (5.0%)       4813.65      (5.7%)      1.6% (  -8% -   12%)
    HighTermDayOfYearSort       1742.58      (5.5%)       1770.22      (5.7%)      1.6% (  -9% -   13%)
    BrowseMonthSSDVFacets       2235.20      (6.4%)       2271.85      (3.4%)      1.6% (  -7% -   12%)
          LowSloppyPhrase       3167.97      (6.6%)       3221.73      (7.1%)      1.7% ( -11% -   16%)
                  MedTerm      10275.01      (4.6%)      10450.43      (4.1%)      1.7% (  -6% -   10%)
                  Prefix3       1522.42      (8.9%)       1551.62      (9.9%)      1.9% ( -15% -   22%)
             HighSpanNear       1680.39      (5.6%)       1714.25      (5.0%)      2.0% (  -8% -   13%)
                MedPhrase       2963.75      (7.1%)       3039.31      (5.5%)      2.5% (  -9% -   16%)
               OrHighHigh       1312.39      (6.2%)       1347.33      (6.1%)      2.7% (  -9% -   16%)
                OrHighLow       1969.23      (5.9%)       2025.16      (4.4%)      2.8% (  -7% -   13%)
         HighSloppyPhrase       1256.32      (5.5%)       1296.12      (6.7%)      3.2% (  -8% -   16%)
               HighPhrase       2202.95      (7.6%)       2311.64      (5.7%)      4.9% (  -7% -   19%)
{code}
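
For anyone reading the tables, the central "Pct diff" figure appears to be the relative change of candidate QPS over baseline QPS. Below is a minimal sketch of that arithmetic, checked against two rows from the results. The class name PctDiff is hypothetical, and the bracketed range, which depends on how the benchmark tool combines the standard deviations, is not reproduced here.

```java
// Hypothetical helper for illustration; not part of the benchmark tooling.
final class PctDiff {
    // Relative change of the candidate over the baseline, in percent.
    static double pctDiff(double baselineQps, double candidateQps) {
        return 100.0 * (candidateQps - baselineQps) / baselineQps;
    }

    public static void main(String[] args) {
        // PKLookup with -DFST.offheap=true (the table reports -11.0%).
        System.out.println(pctDiff(172.88, 153.94));
        // HighPhrase with -DFST.offheap=false (the table reports 4.9%).
        System.out.println(pctDiff(2202.95, 2311.64));
    }
}
```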


> Lazy loading Lucene FST offheap using mmap
> ------------------------------------------
>
>                 Key: LUCENE-8635
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8635
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/FSTs
>         Environment: I used the setup below for the es_rally tests:
> single node i3.xlarge running ES 6.5
> es_rally was running on another i3.xlarge instance
>            Reporter: Ankit Jain
>            Priority: Major
>         Attachments: fst-offheap-ra-rev.patch, offheap.patch, 
> optional_offheap_ra.patch, ra.patch, rally_benchmark.xlsx
>
>
> Currently, the FST loads all the terms into heap memory during index open. 
> This causes frequent JVM OOM issues if the term size gets big. A better 
> approach is to load the FST lazily using mmap, which ensures that only the 
> required terms get loaded into memory.
>  
> Lucene can expose an API for providing the list of fields whose terms should 
> be loaded offheap. I'm planning to take the following approach:
>  # Add a boolean property fstOffHeap in FieldInfo
>  # Pass the list of offheap fields to Lucene during index open (ALL can be a 
> special keyword for loading all fields offheap)
>  # Initialize the fstOffHeap property during Lucene index open
>  # FieldReader invokes the default FST constructor or the OffHeap constructor 
> based on the fstOffHeap field
>  
> I created a patch (that loads all fields offheap), ran some benchmarks using 
> es_rally, and the results look good.
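
The four steps in the description above could be sketched roughly as follows. This is only an illustration under stated assumptions: OffHeapFieldSelector, fstOffHeap(String) and chooseLoader are hypothetical names, not Lucene APIs, and the returned strings merely stand in for the choice between the default FST constructor and the offheap one.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the proposed field selection; none of these names exist in Lucene.
final class OffHeapFieldSelector {
    // Special keyword meaning "load the FST of every field offheap".
    static final String ALL = "ALL";

    private final Set<String> offHeapFields;

    // The set corresponds to the list of offheap fields passed at index open.
    OffHeapFieldSelector(Set<String> offHeapFields) {
        this.offHeapFields = new HashSet<>(offHeapFields);
    }

    // Value that would be stored in the proposed fstOffHeap property for a field.
    boolean fstOffHeap(String fieldName) {
        return offHeapFields.contains(ALL) || offHeapFields.contains(fieldName);
    }

    // Stand-in for the FieldReader decision between the two FST constructors.
    String chooseLoader(String fieldName) {
        return fstOffHeap(fieldName) ? "off-heap (mmap)" : "on-heap";
    }
}
```

Resolving the flag once at index open, as the plan suggests, would keep the per-field decision out of the lookup path.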


