[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16764370#comment-16764370 ]

Ankit Jain edited comment on LUCENE-8635 at 2/10/19 9:51 AM:
-

I added print statements while running the benchmarks, and the classification looks correct:
{code}
Initializing field offheap start=55 field=Date.taxonomy
Initializing field offheap start=76 field=DayOfYear.sortedset
Initializing field offheap start=97 field=Month.sortedset
Initializing field offheap start=118 field=body
Initializing field onheap start=267 field=date
Initializing field onheap start=289 field=groupend
Initializing field onheap start=311 field=id
Initializing field onheap start=333 field=title
{code}
However, when I restricted the tests to PKLookup only, using comp.addTaskPattern('PKLookup') in localrun.py, the results look as expected:
{code:title=wikimedium10k|borderStyle=solid}
Task  QPS baseline (StdDev)  QPS candidate (StdDev)  Pct diff
PKLookup  163.29 (1.6%)  164.80 (2.1%)  0.9% ( -2% - 4%)
{code}
{code:title=wikimedium10m|borderStyle=solid}
Task  QPS baseline (StdDev)  QPS candidate (StdDev)  Pct diff
PKLookup  114.29 (1.7%)  114.73 (1.2%)  0.4% ( -2% - 3%)
{code}
It seems we are good with this change then.
> Lazy loading Lucene FST offheap using mmap
> --
>
> Key: LUCENE-8635
> URL: https://issues.apache.org/jira/browse/LUCENE-8635
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/FSTs
> Environment: I used the below setup for es_rally tests:
> single node i3.xlarge running ES 6.5;
> es_rally was running on another i3.xlarge instance
> Reporter: Ankit Jain
> Priority: Major
> Attachments: fst-offheap-ra-rev.patch, fst-offheap-rev.patch, offheap.patch, optional_offheap_ra.patch, ra.patch, rally_benchmark.xlsx
>
> Currently, the FST loads all terms into heap memory during index open. This causes frequent JVM OOM issues if the term dictionary gets big. A better way of doing this is to lazily load the FST using mmap, which ensures only the required terms get loaded into memory.
>
> Lucene can expose an API for providing the list of fields whose terms should be loaded off-heap. I'm planning to take the following approach:
> # Add a boolean property fstOffHeap in FieldInfo
> # Pass the list of off-heap fields to Lucene during index open (ALL can be a special keyword for loading all fields off-heap)
> # Initialize the fstOffHeap property during Lucene index open
> # FieldReader invokes the default FST constructor or the off-heap constructor based on the fstOffHeap field
>
> I created a patch (that loads all fields off-heap) and ran some benchmarks using es_rally; the results look good.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
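The field-selection steps in the approach above can be sketched with a minimal, hypothetical selector; the class name and call sites are illustrative only, not the actual patch:

```java
import java.util.Set;

// Hypothetical sketch of steps 2-3 above: deciding fstOffHeap per field from a
// user-supplied field list, where "ALL" is a special keyword meaning every field.
public class OffHeapFieldSelector {
    private final Set<String> offHeapFields;

    public OffHeapFieldSelector(Set<String> offHeapFields) {
        this.offHeapFields = offHeapFields;
    }

    // What index open would consult when initializing each FieldInfo's fstOffHeap.
    public boolean isOffHeap(String fieldName) {
        return offHeapFields.contains("ALL") || offHeapFields.contains(fieldName);
    }

    public static void main(String[] args) {
        OffHeapFieldSelector some = new OffHeapFieldSelector(Set.of("body", "title"));
        System.out.println(some.isOffHeap("body")); // true
        System.out.println(some.isOffHeap("id"));   // false
        OffHeapFieldSelector all = new OffHeapFieldSelector(Set.of("ALL"));
        System.out.println(all.isOffHeap("id"));    // true
    }
}
```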
[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760118#comment-16760118 ]

Ankit Jain edited comment on LUCENE-8635 at 2/4/19 7:35 PM:

I have created a [pull request|https://github.com/apache/lucene-solr/pull/563] with the proposed changes. Surprisingly, though, I still see some impact on PKLookup performance. This does not make sense to me; it might be an artifact of my perf-run setup.
{code:title=wikimedium10m|borderStyle=solid}
Task  QPS baseline (StdDev)  QPS candidate (StdDev)  Pct diff
PKLookup  117.45 (2.2%)  108.72 (2.3%)  -7.4% ( -11% - -3%)
OrHighNotMed  1094.23 (2.5%)  1057.88 (2.7%)  -3.3% ( -8% - 1%)
OrHighNotLow  1047.30 (1.7%)  1012.91 (2.5%)  -3.3% ( -7% - 1%)
Fuzzy2  44.10 (2.3%)  42.71 (2.7%)  -3.2% ( -7% - 1%)
OrNotHighLow  1022.67 (2.5%)  992.28 (2.4%)  -3.0% ( -7% - 1%)
BrowseDayOfYearTaxoFacets  7907.19 (2.0%)  7677.99 (2.7%)  -2.9% ( -7% - 1%)
OrNotHighMed  866.37 (1.9%)  843.10 (2.3%)  -2.7% ( -6% - 1%)
LowTerm  2103.58 (3.5%)  2048.98 (3.6%)  -2.6% ( -9% - 4%)
BrowseMonthTaxoFacets  7883.86 (2.0%)  7692.48 (2.1%)  -2.4% ( -6% - 1%)
Fuzzy1  64.44 (1.9%)  62.88 (2.3%)  -2.4% ( -6% - 1%)
OrNotHighHigh  779.27 (2.0%)  761.04 (2.1%)  -2.3% ( -6% - 1%)
Respell  55.60 (2.6%)  54.34 (2.3%)  -2.3% ( -7% - 2%)
OrHighNotHigh  877.28 (2.2%)  858.10 (2.5%)  -2.2% ( -6% - 2%)
BrowseMonthSSDVFacets  14.85 (7.9%)  14.57 (10.7%)  -1.9% ( -18% - 18%)
MedTerm  1984.26 (3.6%)  1947.76 (2.3%)  -1.8% ( -7% - 4%)
AndHighLow  718.71 (1.5%)  706.06 (1.6%)  -1.8% ( -4% - 1%)
OrHighLow  523.40 (2.5%)  515.56 (2.4%)  -1.5% ( -6% - 3%)
HighTerm  1381.10 (2.9%)  1360.80 (2.7%)  -1.5% ( -6% - 4%)
HighTermMonthSort  120.45 (12.3%)  119.00 (16.4%)  -1.2% ( -26% - 31%)
BrowseDayOfYearSSDVFacets  11.55 (9.7%)  11.45 (10.0%)  -0.8% ( -18% - 20%)
AndHighMed  155.15 (2.6%)  154.25 (2.4%)  -0.6% ( -5% - 4%)
OrHighMed  88.00 (2.5%)  87.85 (2.7%)  -0.2% ( -5% - 5%)
LowPhrase  80.53 (1.6%)  80.40 (1.4%)  -0.2% ( -3% - 2%)
AndHighHigh  41.91 (4.2%)  41.86 (2.9%)  -0.1% ( -6% - 7%)
MedPhrase  46.29 (1.4%)  46.33 (1.5%)  0.1% ( -2% - 3%)
IntNRQ  127.54 (0.4%)  127.76 (0.4%)  0.2% ( 0% - 1%)
HighTermDayOfYearSort  48.59 (5.1%)  48.71 (6.0%)  0.2% ( -10% - 12%)
LowSloppyPhrase  13.04 (4.0%)  13.08 (4.3%)  0.3% ( -7% - 8%)
MedSloppyPhrase  19.48 (2.3%)  19.54 (2.4%)  0.3% ( -4% - 5%)
OrHighHigh  23.60 (3.0%)  23.68 (2.9%)  0.3% ( -5% - 6%)
HighPhrase  20.25 (2.4%)  20.32 (1.8%)  0.3% ( -3% - 4%)
HighSloppyPhrase  9.29 (3.3%)  9.32 (3.2%)  0.4% ( -5% - 7%)
LowSpanNear  25.70 (3.8%)  25.89 (3.9%)  0.7% ( -6% - 8%)
MedSpanNear  30.46 (4.1%)  30.69 (4.3%)  0.7% ( -7% - 9%)
HighSpanNear  14.41 (4.3%)  14.60 (4.7%)  1.3% ( -7% - 10%)
Wildcard  70.08 (10.3%)  71.09 (6.1%)  1.4% ( -13% - 19%)
BrowseDateTaxoFacets  2.37 (0.2%)  2.41 (0.3%)  1.5% ( 0% - 1%)
Prefix3  86.71 (11.4%)  89.04 (6.8%)  2.7% ( -13% - 23%)
{code}
[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756098#comment-16756098 ]

Mike Sokolov edited comment on LUCENE-8635 at 1/30/19 1:24 PM:
---

I agree that would be a good start. Perhaps as a separate issue we can add finer control of when to use on- vs. off-heap storage (per field, e.g.). Just to look a little way down that path: it seems the nearest thing to this today is {{get/setPreload()}} and {{get/setUseUnmap}} in {{MMapDirectory}}, but here one really wants a mapping by field name, and a Directory should not really be concerned with field names. Better would be an attribute of {{FieldInfo}}, where we have {{put/getAttribute}}. Then {{FieldReader}} can inspect the {{FieldInfo}} and pass the appropriate {{On/OffHeapStore}} when creating its {{FST}}. What do you think?
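A minimal sketch of the attribute-based selection described above, using stand-in classes ({{FieldInfoSketch}}, the attribute key, and the store names are hypothetical, not Lucene's actual API):

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for FieldInfo's put/getAttribute map; not Lucene's actual class.
class FieldInfoSketch {
    private final Map<String, String> attributes = new HashMap<>();

    public String putAttribute(String key, String value) {
        return attributes.put(key, value);
    }

    public String getAttribute(String key) {
        return attributes.get(key);
    }
}

// Sketch of how FieldReader could pick the FST store from the attribute.
public class FieldReaderSketch {
    // Hypothetical attribute key; the real name would be decided in the patch.
    public static final String FST_OFFHEAP_KEY = "fstOffHeap";

    public static String chooseStore(FieldInfoSketch fi) {
        return "true".equals(fi.getAttribute(FST_OFFHEAP_KEY)) ? "OffHeapStore" : "OnHeapStore";
    }

    public static void main(String[] args) {
        FieldInfoSketch body = new FieldInfoSketch();
        body.putAttribute(FST_OFFHEAP_KEY, "true");
        System.out.println(chooseStore(body));                  // OffHeapStore
        System.out.println(chooseStore(new FieldInfoSketch())); // OnHeapStore
    }
}
```

Since attributes are carried per field inside the segment metadata, this would avoid teaching Directory anything about field names.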
[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755389#comment-16755389 ]

Ankit Jain edited comment on LUCENE-8635 at 1/29/19 9:20 PM:
-

{quote}Given that the performance hit is mostly on PK lookups, maybe a starting point could be to always put the FST off-heap except when docCount == sumDocFreq, which suggests the field is an ID field.{quote}
[~jpountz] - Does that exclude auto-generated ID fields that are UUIDs, which result in large FSTs? Elasticsearch, for example, has an _id field that IMO is better off-heap.
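The docCount == sumDocFreq heuristic quoted above can be sketched as follows. In Lucene these statistics would come from Terms.getDocCount() and Terms.getSumDocFreq(); here they are plain parameters, and the class name is hypothetical:

```java
// Hypothetical sketch of the ID-field heuristic discussed in the comment above.
public class IdFieldHeuristic {
    // If every term occurs in exactly one document, each document contributes
    // exactly one (term, doc) pair, so sumDocFreq equals docCount -- the
    // signature of a primary-key-like field that should stay on-heap.
    public static boolean looksLikeIdField(long docCount, long sumDocFreq) {
        return docCount == sumDocFreq;
    }

    public static void main(String[] args) {
        // PK-like field: 1M docs, every term in exactly one document.
        System.out.println(looksLikeIdField(1_000_000L, 1_000_000L));  // true
        // body-like field: terms repeat across many documents.
        System.out.println(looksLikeIdField(1_000_000L, 50_000_000L)); // false
    }
}
```

Note that, as the comment points out, an auto-generated UUID _id field also satisfies this test even though its large FST might be better kept off-heap.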
[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753609#comment-16753609 ]

Ankit Jain edited comment on LUCENE-8635 at 1/27/19 10:14 PM:
--

Results for bigger data sets:
{code:title=wikimedium10m, java .. -DFST.offheap=true|borderStyle=solid}
Task  QPS baseline (StdDev)  QPS candidate (StdDev)  Pct diff
PKLookup  117.59 (3.0%)  107.48 (2.3%)  -8.6% ( -13% - -3%)
OrHighNotMed  1085.05 (2.1%)  1056.43 (2.2%)  -2.6% ( -6% - 1%)
OrNotHighLow  976.94 (2.4%)  955.32 (1.8%)  -2.2% ( -6% - 2%)
OrHighNotLow  1152.58 (2.6%)  1128.25 (2.0%)  -2.1% ( -6% - 2%)
Fuzzy1  83.10 (2.6%)  81.54 (2.5%)  -1.9% ( -6% - 3%)
IntNRQ  88.53 (16.2%)  86.92 (14.7%)  -1.8% ( -28% - 34%)
OrNotHighHigh  886.10 (1.7%)  870.26 (1.4%)  -1.8% ( -4% - 1%)
OrHighNotHigh  838.32 (1.8%)  824.15 (1.9%)  -1.7% ( -5% - 2%)
BrowseMonthTaxoFacets  8099.58 (2.0%)  7968.65 (1.8%)  -1.6% ( -5% - 2%)
Fuzzy2  55.95 (2.7%)  55.08 (2.5%)  -1.6% ( -6% - 3%)
OrNotHighMed  764.40 (2.3%)  752.56 (1.7%)  -1.5% ( -5% - 2%)
BrowseDayOfYearTaxoFacets  8081.37 (2.1%)  7957.27 (2.7%)  -1.5% ( -6% - 3%)
LowTerm  1941.88 (5.2%)  1912.71 (4.0%)  -1.5% ( -10% - 8%)
HighTermMonthSort  78.12 (10.8%)  76.99 (14.3%)  -1.4% ( -23% - 26%)
Respell  61.23 (2.7%)  60.57 (2.7%)  -1.1% ( -6% - 4%)
HighTerm  1526.16 (3.1%)  1510.23 (1.8%)  -1.0% ( -5% - 4%)
MedTerm  1814.44 (3.7%)  1797.69 (2.1%)  -0.9% ( -6% - 5%)
OrHighLow  443.93 (2.4%)  439.92 (2.5%)  -0.9% ( -5% - 4%)
AndHighLow  577.60 (2.0%)  573.43 (1.4%)  -0.7% ( -4% - 2%)
Wildcard  62.79 (5.8%)  62.54 (6.1%)  -0.4% ( -11% - 12%)
BrowseDayOfYearSSDVFacets  11.56 (8.0%)  11.55 (8.2%)  -0.0% ( -15% - 17%)
Prefix3  165.76 (8.7%)  165.70 (9.2%)  -0.0% ( -16% - 19%)
MedSpanNear  51.40 (2.3%)  51.48 (2.5%)  0.2% ( -4% - 5%)
BrowseMonthSSDVFacets  14.45 (13.6%)  14.47 (13.2%)  0.2% ( -23% - 31%)
HighTermDayOfYearSort  44.98 (6.8%)  45.05 (5.3%)  0.2% ( -11% - 13%)
OrHighMed  111.81 (3.0%)  112.01 (2.8%)  0.2% ( -5% - 6%)
LowSpanNear  47.14 (2.4%)  47.24 (2.5%)  0.2% ( -4% - 5%)
MedSloppyPhrase  48.25 (1.9%)  48.37 (2.3%)  0.2% ( -3% - 4%)
LowSloppyPhrase  35.36 (2.2%)  35.46 (2.5%)  0.3% ( -4% - 5%)
AndHighMed  144.05 (3.6%)  144.53 (2.7%)  0.3% ( -5% - 6%)
HighSpanNear  6.92 (3.5%)  6.95 (3.5%)  0.5% ( -6% - 7%)
MedPhrase  25.88 (2.4%)  26.00 (1.4%)  0.5% ( -3% - 4%)
AndHighHigh  38.77 (4.0%)  38.98 (3.9%)  0.5% ( -7% - 8%)
OrHighHigh  27.47 (3.2%)  27.63 (3.1%)  0.6% ( -5% - 7%)
LowPhrase  91.71 (4.3%)  92.56 (3.5%)  0.9% ( -6% - 9%)
HighSloppyPhrase  18.28 (3.2%)  18.45 (3.6%)  0.9% ( -5% - 8%)
HighPhrase  20.07 (3.9%)  20.35 (1.3%)  1.4% ( -3% - 6%)
BrowseDateTaxoFacets  2.37 (0.4%)  2.41 (0.2%)  1.4% ( 0% - 2%)
{code}
[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753595#comment-16753595 ]

Ankit Jain edited comment on LUCENE-8635 at 1/27/19 9:26 PM:
-

I also independently tried a performance run after removing the array reversal in readBytes in the original patch, but the results looked similar to the earlier results. Since we are leaning towards keeping this optional, I created another patch - [^optional_offheap_ra.patch], based off the reverse random-access reader - [^ra.patch], that adds FST.offheap as a system property to allow toggling between off-heap and on-heap. The results for wikimedium10k with java .. -DFST.offheap=true:
{code}
Task  QPS baseline (StdDev)  QPS candidate (StdDev)  Pct diff
PKLookup  172.88 (3.3%)  153.94 (3.7%)  -11.0% ( -17% - -4%)
LowTerm  12229.10 (3.5%)  11032.10 (3.3%)  -9.8% ( -16% - -3%)
AndHighLow  4679.22 (3.2%)  4349.12 (3.3%)  -7.1% ( -13% - 0%)
MedTerm  10179.43 (5.4%)  9533.14 (3.4%)  -6.3% ( -14% - 2%)
HighTerm  5123.89 (3.1%)  4814.09 (4.7%)  -6.0% ( -13% - 1%)
LowPhrase  3459.57 (5.3%)  3253.20 (7.5%)  -6.0% ( -17% - 7%)
MedPhrase  2815.82 (5.1%)  2654.13 (5.6%)  -5.7% ( -15% - 5%)
MedSpanNear  2196.98 (4.4%)  2082.39 (3.9%)  -5.2% ( -12% - 3%)
HighSloppyPhrase  1680.32 (5.7%)  1592.91 (8.0%)  -5.2% ( -17% - 9%)
LowSloppyPhrase  3205.99 (4.9%)  3045.94 (4.4%)  -5.0% ( -13% - 4%)
OrHighMed  1960.52 (4.8%)  1866.03 (6.2%)  -4.8% ( -15% - 6%)
Wildcard  1388.45 (8.5%)  1324.82 (6.2%)  -4.6% ( -17% - 11%)
OrHighHigh  1304.03 (7.8%)  1247.72 (5.1%)  -4.3% ( -16% - 9%)
AndHighMed  2268.22 (2.8%)  2171.27 (2.8%)  -4.3% ( -9% - 1%)
MedSloppyPhrase  2697.01 (6.1%)  2597.71 (5.0%)  -3.7% ( -13% - 7%)
HighTermDayOfYearSort  1719.25 (5.3%)  1657.10 (5.8%)  -3.6% ( -13% - 7%)
HighSpanNear  1624.69 (4.4%)  1567.35 (5.6%)  -3.5% ( -12% - 6%)
AndHighHigh  1645.28 (3.7%)  1589.76 (2.9%)  -3.4% ( -9% - 3%)
LowSpanNear  2319.98 (6.0%)  2246.30 (5.5%)  -3.2% ( -13% - 8%)
OrHighLow  2264.00 (6.0%)  2200.33 (4.3%)  -2.8% ( -12% - 7%)
HighTermMonthSort  4829.60 (3.9%)  4700.35 (2.5%)  -2.7% ( -8% - 3%)
Fuzzy2  172.46 (4.8%)  168.02 (5.4%)  -2.6% ( -12% - 8%)
HighPhrase  2525.60 (6.3%)  2464.09 (5.3%)  -2.4% ( -13% - 9%)
Fuzzy1  585.39 (4.4%)  571.20 (4.1%)  -2.4% ( -10% - 6%)
Prefix3  1359.75 (8.2%)  1330.98 (5.8%)  -2.1% ( -14% - 12%)
Respell  501.29 (3.2%)  490.92 (4.7%)  -2.1% ( -9% - 5%)
BrowseMonthTaxoFacets  8450.33 (4.7%)  8354.07 (4.9%)  -1.1% ( -10% - 8%)
BrowseDayOfYearSSDVFacets  2016.73 (3.4%)  2009.96 (4.0%)  -0.3% ( -7% - 7%)
BrowseDayOfYearTaxoFacets  8303.67 (6.4%)  8294.91 (5.6%)  -0.1% ( -11% - 12%)
IntNRQ  1380.11 (2.1%)  1380.36 (2.0%)  0.0% ( -3% - 4%)
BrowseDateTaxoFacets  3564.47 (3.2%)  3575.88 (3.2%)  0.3% ( -5% - 7%)
BrowseMonthSSDVFacets  2247.87 (5.4%)  2276.28 (3.5%)  1.3% ( -7% - 10%)
{code}
java .. -DFST.offheap=false:
{code}
Task  QPS baseline (StdDev)  QPS candidate (StdDev)  Pct diff
LowPhrase  3244.01 (6.3%)  3201.30 (7.0%)  -1.3% ( -13% - 12%)
PKLookup  171.24 (3.3%)  169.28 (5.3%)  -1.1% ( -9% - 7%)
MedSloppyPhrase  2867.58 (6.3%)  2848.80 (6.9%)  -0.7% ( -13% - 13%)
BrowseMonthTaxoFacets  8565.92 (4.9%)  8514.51 (5.3%)  -0.6% ( -10% - 10%)
Respell  529.20 (3.6%)  526.69 (3.4%)  -0.5% ( -7% - 6%)
Wildcard  1252.25 (7.6%)  1249.97 (7.3%)  -0.2% ( -13% - 15%)
IntNRQ  1536.74 (1.7%)  1536.53 (2.1%)  -0.0% ( -3% - 3%)
BrowseDayOfYearTaxoFacets  8490.89 (6.3%)  8490.94 (5.5%)  0.0% ( -11% - 12%)
LowSpanNear  2391.88 (3.0%)  2392.15
{code}
[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753468#comment-16753468 ]

Mike Sokolov edited comment on LUCENE-8635 at 1/27/19 5:51 PM:
---

I tried that, [~akjain], and stumbled into a trap that caused a big drop in performance! I just used a wrapper around {{IndexInput}} rather than the random-access approach (using {{randomAccessSlice}}) and implemented {{skipBytes}} in the obvious way: by calling the delegate's {{skipBytes}}. But this is bad. The default implementation of that method comes from {{DataInput}}, and it actually reads bytes into a buffer rather than simply updating a pointer. I'm not sure I understand the rationale for that - it seems to have to do with checksumming? Possibly {{ByteBuffer(s)IndexInput}} could (should?) implement this more efficiently, or maybe it's required to do this reading -- not sure. At any rate, I think in this case we really just want to move the pointer, so we can have our {{FST.BytesReader.skipBytes}} call {{IndexInput.seek}} instead of {{IndexInput.skipBytes}}.
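The pitfall can be illustrated with minimal stand-ins (these classes are hypothetical, not Lucene's actual IndexInput/DataInput): a buffered, DataInput-style skip physically reads every skipped byte, while a seek only moves the file pointer.

```java
// Hypothetical counting reader: contrasts skip-by-reading with a plain seek.
public class SkipVsSeek {
    static class CountingInput {
        private final byte[] data;
        private long pos;
        long bytesRead; // bytes physically read so far

        CountingInput(byte[] data) {
            this.data = data;
        }

        byte readByte() {
            bytesRead++;
            return data[(int) pos++];
        }

        // DataInput-like default: "skipping" actually reads every skipped byte.
        void skipBytesByReading(long count) {
            for (long i = 0; i < count; i++) {
                readByte();
            }
        }

        // IndexInput-like seek: just repositions, reading nothing.
        void seek(long newPos) {
            pos = newPos;
        }

        long getFilePointer() {
            return pos;
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[1024];

        CountingInput viaSkip = new CountingInput(data);
        viaSkip.skipBytesByReading(1000);
        System.out.println("skip-by-reading read " + viaSkip.bytesRead + " bytes"); // 1000

        CountingInput viaSeek = new CountingInput(data);
        viaSeek.seek(viaSeek.getFilePointer() + 1000); // what BytesReader.skipBytes should do
        System.out.println("seek read " + viaSeek.bytesRead + " bytes"); // 0
    }
}
```

In an FST traversal that skips constantly, the difference between these two behaviors is exactly the performance cliff described above.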
[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750316#comment-16750316 ]

Ankit Jain edited comment on LUCENE-8635 at 1/23/19 6:47 PM:
-

{quote}Ankit Jain unfortunately RandomAccessInput doesn't offer readBytes. I'm looking into adding it; shouldn't be hard as there aren't that many implementations.{quote}
You don't need to use RandomAccessInput. You can revert to the original IndexInputReader and get rid of the reversal logic:
{code:title=ForwardIndexInputReader|borderStyle=solid}
/** Implements forward reads for an FST from an index input. */
final class ForwardIndexInputReader extends FST.BytesReader {
  private final IndexInput in;
  private final long startFP;

  public ForwardIndexInputReader(IndexInput in, long startFP) {
    this.in = in;
    this.startFP = startFP;
  }

  @Override
  public byte readByte() throws IOException {
    return in.readByte();
  }

  @Override
  public void readBytes(byte[] b, int offset, int len) throws IOException {
    in.readBytes(b, offset, len);
  }

  @Override
  public void skipBytes(long count) {
    setPosition(getPosition() + count);
  }

  @Override
  public long getPosition() {
    return in.getFilePointer() - startFP;
  }

  @Override
  public void setPosition(long pos) {
    try {
      in.seek(startFP + pos);
    } catch (IOException ex) {
      // setPosition cannot declare IOException, so rethrow unchecked
      // rather than swallowing the failure
      throw new UncheckedIOException(ex);
    }
  }

  @Override
  public boolean reversed() {
    return false;
  }
}
{code}
{quote}Furthermore the NIO and Simple FS directories use buffering. I'm wondering how bad things would be if every seek would need to reload the buffer?{quote}
This can be a serious concern for the NIO and Simple FS directories. Given that most systems today use mmap, can we limit the off-heap FST to mmap-supported systems, i.e.
{code:title=isMMapSupported|borderStyle=solid}
Constants.JRE_IS_64BIT && MMapDirectory.UNMAP_SUPPORTED
{code}
[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749180#comment-16749180 ] Ankit Jain edited comment on LUCENE-8635 at 1/22/19 9:41 PM: - {quote}Technically we could make things work for existing segments since your patch doesn't change the file format.{quote} [~jpountz] - I'm curious on how this can be done. I looked at the code and it seemed that all settings are passed to the segment writer and writer should put those settings in codec for reader to consume. Do you have any pointers on this? {quote}I agree it's a bit unlikely that the terms index gets paged out, but you can still end up with a cold FS cache eg. when the host restarts?{quote} There can be option for preloading terms index during index open. Even though, lucene already provides option for preloading mapped buffer [here|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java#L95], it is done at directory level and not file level. Though, elasticsearch worked around that to provide [file level setting|https://www.elastic.co/guide/en/elasticsearch/reference/master/_pre_loading_data_into_the_file_system_cache.html] {quote}For the record, Lucene also performs implicit PK lookups when indexing with updateDocument. So this might have an impact on indexing speed as well.{quote} If customer workload is updateDocument heavy, the impact should be minimal, as terms index will get loaded into memory after first fault for every page and then there should not be any page faults. If customers are sensitive to latency, they can use the preload option for terms index. {quote}Wondering whether avoiding 'array reversal' in the second patch is what helped rather than moving to random access and removing skip? May be we should try with reading one byte at a time with original patch.{quote} I overlooked that earlier and attributed performance gain to absence of seek operation. 
This makes a lot more sense; I will try some experiments by changing readBytes as below:
{code:title=ReverseIndexInputReader.java|borderStyle=solid}
public byte readByte() throws IOException {
  final byte b = this.in.readByte();
  this.skipBytes(2);
  return b;
}

public void readBytes(byte[] b, int offset, int len) throws IOException {
  for (int i = offset + len - 1; i >= offset; i--) {
    b[i] = this.readByte();
  }
}
{code}
{quote}I uploaded a patch that combines these three things: off-heap FST + random-access reader + reversal of the FST so it is forward-read. Unit tests are passing; I'm running some benchmarks to see what the impact is on performance{quote}
That's great, Mike. If this works, we don't need the reverse reader. We don't even need the random-access reader, as we can simply change readBytes as below:
{code:title=ReverseIndexInputReader.java|borderStyle=solid}
public void readBytes(byte[] b, int offset, int len) throws IOException {
  this.in.readBytes(b, offset, len);
}
{code}
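As a side note on the reverse-read semantics discussed above, here is a self-contained toy sketch (plain Java, not Lucene's actual reader classes; the byte values and buffer sizes are illustrative) of why filling the caller's buffer from its tail while the stream position moves backward leaves the bytes in forward order, so no separate array-reversal pass is needed:

```java
public class ReverseReadSketch {
    // Toy model of the reverse reader: FST bytes are written forward but
    // consumed backward, so the reader starts at the end of the slice and
    // walks toward the start.
    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5}; // illustrative underlying bytes
        int pos = data.length - 1;     // reverse reader positioned at the last byte
        byte[] b = new byte[3];
        // Fill b from its tail while the stream position moves backward,
        // so b ends up holding bytes 3,4,5 in their original forward order.
        for (int i = b.length - 1; i >= 0; i--) {
            b[i] = data[pos--];
        }
        System.out.println(b[0] + "," + b[1] + "," + b[2]); // prints 3,4,5
    }
}
```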
[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744344#comment-16744344 ] Mike Sokolov edited comment on LUCENE-8635 at 1/16/19 6:54 PM: ---
Following a suggestion from [~mikemccand] I tried a slightly different version of this, making use of randomAccessSlice to avoid some calls to seek(), and this gives better perf in the benchmarks. I also spent some time trying to understand FST's backwards-seeking behavior. Based on my crude understanding, and a comment from Mike again, it seems as if with some work it would be possible to make it more naturally forward-seeking, but it's not obvious that in general you would get more local, cache-friendly access patterns from that. Still, you might; it probably needs some experimentation to know for sure. Here are the benchmark numbers from the random-access patch:
{noformat}
                     Task    QPS before   StdDev    QPS after   StdDev   Pct diff
                 PKLookup      133.62  (2.2%)      123.74  (1.5%)   -7.4% ( -10% -  -3%)
               AndHighLow     3411.49  (3.2%)     3268.04  (3.1%)   -4.2% ( -10% -   2%)
BrowseDayOfYearTaxoFacets    10067.18  (4.3%)     9828.65  (3.5%)   -2.4% (  -9% -   5%)
                  LowTerm     3567.48  (1.2%)     3489.27  (1.7%)   -2.2% (  -5% -   0%)
                   Fuzzy1      147.67  (3.1%)      144.65  (2.4%)   -2.0% (  -7% -   3%)
    BrowseMonthTaxoFacets    10102.27  (4.2%)     9901.49  (4.1%)   -2.0% (  -9% -   6%)
                   Fuzzy2       62.00  (2.8%)       60.87  (2.4%)   -1.8% (  -6% -   3%)
                  MedTerm     2694.87  (2.0%)     2647.08  (2.1%)   -1.8% (  -5% -   2%)
               AndHighMed     1171.52  (2.7%)     1154.25  (2.8%)   -1.5% (  -6% -   4%)
                 HighTerm     2061.53  (2.3%)     2032.84  (2.5%)   -1.4% (  -6% -   3%)
          MedSloppyPhrase      266.60  (3.4%)      263.01  (4.2%)   -1.3% (  -8% -   6%)
               OrHighHigh      278.90  (4.0%)      275.35  (4.7%)   -1.3% (  -9% -   7%)
         HighSloppyPhrase      107.68  (5.5%)      106.34  (5.6%)   -1.2% ( -11% -  10%)
                  Respell      118.26  (2.1%)      116.95  (2.2%)   -1.1% (  -5% -   3%)
              AndHighHigh      472.93  (4.4%)      467.78  (3.3%)   -1.1% (  -8% -   6%)
                OrHighMed      755.21  (2.9%)      748.34  (3.3%)   -0.9% (  -6% -   5%)
              MedSpanNear      308.31  (3.3%)      305.59  (3.8%)   -0.9% (  -7% -   6%)
                 Wildcard      869.37  (3.5%)      862.74  (1.9%)   -0.8% (  -5% -   4%)
        HighTermMonthSort      871.33  (7.1%)      865.80  (6.1%)   -0.6% ( -12% -  13%)
                MedPhrase      449.39  (3.0%)      446.55  (2.4%)   -0.6% (  -5% -   4%)
              LowSpanNear      391.10  (3.3%)      388.77  (3.8%)   -0.6% (  -7% -   6%)
          LowSloppyPhrase      406.57  (3.8%)      404.23  (3.6%)   -0.6% (  -7% -   7%)
               HighPhrase      239.84  (3.7%)      238.78  (3.3%)   -0.4% (  -7% -   6%)
                  Prefix3     1230.56  (5.0%)     1225.52  (2.9%)   -0.4% (  -7% -   7%)
             HighSpanNear      107.34  (5.2%)      107.20  (5.3%)   -0.1% ( -10% -  10%)
                LowPhrase      438.52  (3.4%)      438.14  (2.5%)   -0.1% (  -5% -   5%)
     BrowseDateTaxoFacets       11.14  (4.0%)       11.16  (7.0%)    0.2% ( -10% -  11%)
    HighTermDayOfYearSort      606.85  (6.7%)      608.65  (5.4%)    0.3% ( -11% -  13%)
                   IntNRQ      987.08 (12.5%)      990.96 (13.5%)    0.4% ( -22% -  30%)
                OrHighLow      553.72  (3.2%)      558.09  (3.5%)    0.8% (  -5% -   7%)
BrowseDayOfYearSSDVFacets       38.23  (3.9%)       38.66  (4.1%)    1.1% (  -6% -   9%)
    BrowseMonthSSDVFacets       42.05  (3.5%)       42.57  (3.7%)    1.2% (  -5% -   8%)
{noformat}
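As a rough illustration of why randomAccessSlice can help (a toy sketch using java.nio.ByteBuffer, not Lucene's actual IndexInput/RandomAccessInput API): a sequential reader must reposition before each backward read, while a random-access view carries the offset in each call and keeps no mutable position between reads:

```java
import java.nio.ByteBuffer;

public class RandomAccessSketch {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[]{10, 20, 30, 40});

        // Sequential-style access: each backward read needs an explicit
        // reposition (the analogue of IndexInput.seek) before the read.
        buf.position(3);
        byte last = buf.get();     // reads 40, position advances to 4
        buf.position(2);           // "seek" back before the next read
        byte prev = buf.get();     // reads 30

        // Random-access-style: absolute gets take the offset directly,
        // so no seek bookkeeping is needed between reads.
        byte lastRa = buf.get(3);  // 40
        byte prevRa = buf.get(2);  // 30

        System.out.println(last + " " + prev + " " + lastRa + " " + prevRa);
    }
}
```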
[jira] [Comment Edited] (LUCENE-8635) Lazy loading Lucene FST offheap using mmap
[ https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740855#comment-16740855 ] Ankit Jain edited comment on LUCENE-8635 at 1/12/19 4:58 AM: -
The Excel sheet is big, so I'm not sure pasting it here would help. You have a good point about moving FSTs off-heap in the default codec, as we can always preload the mmap file during index open, as demonstrated [here|https://www.elastic.co/guide/en/elasticsearch/reference/master/_pre_loading_data_into_the_file_system_cache.html].
I ran the default Lucene test suite and a couple of tests failed, though they don't seem to have anything to do with my change:
{noformat}
[junit4] Tests with failures [seed: 1D3ADDF6AE377902]:
[junit4]   - org.apache.solr.cloud.autoscaling.ScheduledMaintenanceTriggerTest.testInactiveShardCleanup
[junit4]   - org.apache.solr.cloud.autoscaling.ScheduledTriggerTest.testTrigger
[junit4] Execution time total: 1 hour 12 minutes 40 seconds
[junit4] Tests summary: 833 suites (7 ignored), 4024 tests, 2 failures, 286 ignored (153 assumptions)
{noformat}
UPDATE: The tests passed after retrying individually.