[ 
https://issues.apache.org/jira/browse/LUCENE-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692837#comment-13692837
 ] 

Adrien Grand commented on LUCENE-5077:
--------------------------------------

I reran the WIKI_MEDIUM_1M benchmark with various norms formats, and 
Lucene42NormsFormat with PackedInts.DEFAULT doesn't look bad:

{noformat}
Default norms format: 1991830 bytes of norms

Lucene42NormsFormat(PackedInts.DEFAULT): 909910 bytes of norms

                    Task   QPS trunk      StdDev   QPS packed norms      StdDev    Pct diff
                HighTerm      758.15      (6.4%)      643.01      (7.5%)   -15.2% ( -27% -   -1%)
              OrHighHigh      296.86     (10.3%)      280.84     (10.6%)    -5.4% ( -23% -   17%)
               OrHighMed      218.24     (10.7%)      209.35     (10.9%)    -4.1% ( -23% -   19%)
                  Fuzzy2      140.18      (4.0%)      135.14      (5.3%)    -3.6% ( -12% -    5%)
                 MedTerm     1578.99      (7.4%)     1546.60      (4.8%)    -2.1% ( -13% -   10%)
              HighPhrase      160.42      (6.6%)      157.22      (4.0%)    -2.0% ( -11% -    9%)
               OrHighLow      552.01      (9.9%)      543.15     (10.8%)    -1.6% ( -20% -   21%)
                PKLookup      386.15      (5.4%)      382.35      (4.5%)    -1.0% ( -10% -    9%)
             MedSpanNear      135.61      (3.5%)      134.41      (4.1%)    -0.9% (  -8% -    7%)
            HighSpanNear       10.72      (3.2%)       10.63      (2.2%)    -0.8% (  -6% -    4%)
        HighSloppyPhrase       47.29      (4.3%)       47.09      (5.0%)    -0.4% (  -9% -    9%)
             LowSpanNear       63.62      (3.4%)       63.83      (4.1%)     0.3% (  -6% -    8%)
                 Respell      117.48      (4.8%)      118.03      (4.2%)     0.5% (  -8% -    9%)
                Wildcard      288.18      (4.0%)      289.88      (4.3%)     0.6% (  -7% -    9%)
             AndHighHigh      478.72      (3.7%)      481.87      (3.2%)     0.7% (  -6% -    7%)
                 Prefix3     1399.57      (3.8%)     1410.64      (6.0%)     0.8% (  -8% -   10%)
         MedSloppyPhrase      233.10      (3.8%)      235.37      (4.2%)     1.0% (  -6% -    9%)
              AndHighMed      751.65      (3.7%)      759.12      (4.7%)     1.0% (  -7% -    9%)
               MedPhrase      119.14      (5.2%)      120.52      (4.7%)     1.2% (  -8% -   11%)
                  Fuzzy1      142.29      (3.7%)      144.50      (4.5%)     1.6% (  -6% -   10%)
              AndHighLow     2365.88      (6.6%)     2407.32      (4.7%)     1.8% (  -8% -   13%)
               LowPhrase      256.84      (4.3%)      262.04      (2.6%)     2.0% (  -4% -    9%)
         LowSloppyPhrase      313.62      (2.9%)      321.21      (3.5%)     2.4% (  -3% -    9%)
                  IntNRQ      117.27      (7.1%)      121.22     (11.0%)     3.4% ( -13% -   23%)
                 LowTerm     2760.64      (4.5%)     2907.64      (6.8%)     5.3% (  -5% -   17%)



Lucene42NormsFormat(PackedInts.DEFAULT): 896406 bytes of norms

                    Task   QPS trunk      StdDev   QPS packed norms      StdDev    Pct diff
                HighTerm      698.74      (9.5%)      607.43      (8.0%)   -13.1% ( -27% -    4%)
              OrHighHigh      247.01      (6.3%)      216.49      (5.8%)   -12.4% ( -23% -    0%)
               OrHighMed      339.84      (6.1%)      301.83      (7.1%)   -11.2% ( -23% -    2%)
               OrHighLow      385.26      (5.6%)      342.81      (7.5%)   -11.0% ( -22% -    2%)
                 MedTerm     1100.36     (10.0%)      983.30      (7.5%)   -10.6% ( -25% -    7%)
              HighPhrase      181.74      (8.1%)      176.96      (5.9%)    -2.6% ( -15% -   12%)
                  Fuzzy1      157.29      (5.1%)      154.49      (4.7%)    -1.8% ( -10% -    8%)
            HighSpanNear       34.67      (3.6%)       34.13      (2.5%)    -1.5% (  -7% -    4%)
                 Prefix3      437.45      (6.1%)      431.17      (6.0%)    -1.4% ( -12% -   11%)
        HighSloppyPhrase        5.96      (4.1%)        5.91      (2.7%)    -0.8% (  -7% -    6%)
         MedSloppyPhrase      264.84      (4.2%)      262.92      (4.9%)    -0.7% (  -9% -    8%)
                 Respell      194.30      (5.8%)      192.95      (4.3%)    -0.7% ( -10% -    9%)
               MedPhrase      132.99      (5.6%)      132.37      (5.2%)    -0.5% ( -10% -   10%)
                Wildcard      235.47      (4.8%)      235.00      (4.5%)    -0.2% (  -9% -    9%)
             AndHighHigh      338.04      (3.3%)      337.96      (2.4%)    -0.0% (  -5% -    5%)
               LowPhrase      353.22      (6.9%)      353.80      (5.3%)     0.2% ( -11% -   13%)
             LowSpanNear       79.68      (3.6%)       79.98      (4.5%)     0.4% (  -7% -    8%)
                  Fuzzy2       79.15      (6.6%)       79.49      (5.6%)     0.4% ( -11% -   13%)
                PKLookup      387.23      (6.7%)      389.36      (4.5%)     0.5% ( -10% -   12%)
         LowSloppyPhrase      649.88      (2.7%)      655.05      (4.2%)     0.8% (  -5% -    7%)
                  IntNRQ      191.57      (7.7%)      195.08      (9.8%)     1.8% ( -14% -   20%)
              AndHighLow     2025.29      (7.1%)     2065.03      (6.4%)     2.0% ( -10% -   16%)
             MedSpanNear      415.85      (4.5%)      426.71      (4.0%)     2.6% (  -5% -   11%)
              AndHighMed      956.96      (5.4%)      990.30      (6.6%)     3.5% (  -8% -   16%)
                 LowTerm     2644.68      (7.4%)     2745.68      (8.1%)     3.8% ( -10% -   20%)

DiskNormsFormat (same as DiskDVF but for norms): 896314 bytes of norms

                    Task   QPS trunk      StdDev   QPS packed norms      StdDev    Pct diff
                HighTerm      359.42     (12.9%)      204.00      (2.5%)   -43.2% ( -51% -  -32%)
              OrHighHigh      269.86      (7.4%)      177.72      (4.1%)   -34.1% ( -42% -  -24%)
               OrHighLow      358.36      (8.1%)      238.59      (4.1%)   -33.4% ( -42% -  -23%)
               OrHighMed      305.65      (8.6%)      207.21      (4.7%)   -32.2% ( -41% -  -20%)
                 MedTerm     1342.66      (9.2%)      913.30      (3.4%)   -32.0% ( -40% -  -21%)
                 LowTerm     2849.62     (10.9%)     2449.59      (5.4%)   -14.0% ( -27% -    2%)
             AndHighHigh      278.22      (3.8%)      249.40      (2.4%)   -10.4% ( -15% -   -4%)
              HighPhrase      141.20      (6.5%)      131.19      (4.3%)    -7.1% ( -16% -    3%)
              AndHighMed      410.39      (3.5%)      399.99      (3.1%)    -2.5% (  -8% -    4%)
            HighSpanNear       42.28      (2.7%)       41.21      (2.8%)    -2.5% (  -7% -    3%)
              AndHighLow     1932.50      (8.4%)     1895.71      (8.0%)    -1.9% ( -16% -   15%)
                  Fuzzy1      171.83      (4.0%)      168.69      (4.3%)    -1.8% (  -9% -    6%)
                  Fuzzy2       47.29      (4.1%)       46.75      (3.1%)    -1.1% (  -7% -    6%)
                Wildcard      441.76      (4.8%)      437.28      (4.8%)    -1.0% ( -10% -    8%)
                 Respell      133.99      (3.7%)      132.66      (2.8%)    -1.0% (  -7% -    5%)
                  IntNRQ      125.99      (8.7%)      125.24      (7.5%)    -0.6% ( -15% -   17%)
             MedSpanNear      107.53      (3.2%)      107.04      (4.9%)    -0.5% (  -8% -    7%)
                 Prefix3      570.56      (4.7%)      568.06      (4.9%)    -0.4% (  -9% -    9%)
         MedSloppyPhrase      247.61      (4.4%)      249.33      (3.6%)     0.7% (  -7% -    9%)
               LowPhrase      223.67      (3.7%)      225.77      (3.9%)     0.9% (  -6% -    8%)
        HighSloppyPhrase       46.13      (4.8%)       46.68      (5.9%)     1.2% (  -9% -   12%)
                PKLookup      381.14      (2.5%)      385.72      (4.3%)     1.2% (  -5% -    8%)
             LowSpanNear      109.87      (3.6%)      111.83      (4.7%)     1.8% (  -6% -   10%)
         LowSloppyPhrase      179.23      (3.3%)      184.36      (4.2%)     2.9% (  -4% -   10%)
               MedPhrase      202.33      (3.0%)      208.91      (4.0%)     3.3% (  -3% -   10%)
{noformat}
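
For reference, the headline figures can be recomputed from the raw numbers in the output. This is only a minimal sketch: the class and method names (NormsBenchmarkMath, pctSaved, pctDiff) are made up for illustration, and it assumes that "Pct diff" is the relative change of the mean QPS against trunk, which does reproduce the reported -15.2% for HighTerm in the first table.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene or luceneutil.
final class NormsBenchmarkMath {

    /** Relative change of the candidate QPS against the baseline, in percent. */
    static double pctDiff(double baselineQps, double candidateQps) {
        return 100.0 * (candidateQps - baselineQps) / baselineQps;
    }

    /** Share of the baseline size that the candidate saves, in percent. */
    static double pctSaved(long baselineBytes, long candidateBytes) {
        return 100.0 * (baselineBytes - candidateBytes) / baselineBytes;
    }

    public static void main(String[] args) {
        // Default norms format vs. Lucene42NormsFormat(PackedInts.DEFAULT), first run.
        System.out.printf(Locale.ROOT, "norms size saved: %.1f%%%n",
                pctSaved(1991830L, 909910L));      // prints "norms size saved: 54.3%"
        // HighTerm row of the first table: trunk QPS vs. packed-norms QPS.
        System.out.printf(Locale.ROOT, "HighTerm pct diff: %.1f%%%n",
                pctDiff(758.15, 643.01));          // prints "HighTerm pct diff: -15.2%"
    }
}
```

So the first packed run gives up roughly 15% on HighTerm in exchange for norms that are a bit less than half the default size.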
                
> make it easier to use compressed norms
> --------------------------------------
>
>                 Key: LUCENE-5077
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5077
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>             Fix For: 5.0, 4.4
>
>         Attachments: LUCENE-5077.patch
>
>
> Lucene42DVConsumer's ctor takes acceptableOverheadRatio, so that you can 
> trade off time/space, and we pass PackedInts.FASTEST so we always use 8 bits 
> per value.
> But the class is package-private, so if I want to make my own NormsFormat and 
> pass e.g. PackedInts.COMPACT, I can't ... I think we should make this class 
> public / @experimental?
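
To make the time/space trade-off in the description concrete, here is a minimal, self-contained sketch of fixed-width bit packing. It is not Lucene's PackedInts implementation, and every name in it (PackedSketch, bitsRequired, packedBytes, pack, get) is hypothetical. It only illustrates the general idea: fewer bits per value shrink the data, while each read needs shifts and masks instead of a plain byte load.

```java
// Hypothetical illustration of fixed-width bit packing; not Lucene code.
final class PackedSketch {

    /** Minimum number of bits needed to represent any value in [0, maxValue]. */
    static int bitsRequired(long maxValue) {
        if (maxValue < 0) {
            throw new IllegalArgumentException("maxValue must be non-negative");
        }
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    /** Bytes needed for valueCount values at bitsPerValue bits each, rounded up. */
    static long packedBytes(long valueCount, int bitsPerValue) {
        return (valueCount * bitsPerValue + 7) / 8;
    }

    /** Packs values (each assumed to fit in bitsPerValue bits) into 64-bit blocks. */
    static long[] pack(long[] values, int bitsPerValue) {
        long[] blocks = new long[(int) ((values.length * (long) bitsPerValue + 63) / 64)];
        for (int i = 0; i < values.length; i++) {
            long bitPos = (long) i * bitsPerValue;
            int block = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            blocks[block] |= values[i] << shift;
            if (shift + bitsPerValue > 64) {
                // The value straddles two blocks: spill the high bits into the next one.
                blocks[block + 1] |= values[i] >>> (64 - shift);
            }
        }
        return blocks;
    }

    /** Reads the value at index back out of the packed blocks. */
    static long get(long[] blocks, int bitsPerValue, int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        if (shift + bitsPerValue > 64) {
            value |= blocks[block + 1] << (64 - shift);
        }
        long mask = bitsPerValue == 64 ? -1L : (1L << bitsPerValue) - 1;
        return value & mask;
    }

    public static void main(String[] args) {
        // One million values: byte-aligned (8 bits) vs. packed at 4 bits per value.
        System.out.println(packedBytes(1_000_000L, 8)); // prints 1000000
        System.out.println(packedBytes(1_000_000L, 4)); // prints 500000
    }
}
```

With 8 bits per value, a read is a single array access; at other widths, `get` has to combine bits from up to two blocks, which is the kind of extra per-document work that could show up in scoring-heavy queries.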
