[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward
[ https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750314#comment-16750314 ]

Lucene/Solr QA commented on LUCENE-8653:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | compile | 0m 39s | master passed |
|| || || || Patch Compile Tests ||
| +1 | compile | 0m 39s | the patch passed |
| +1 | javac | 0m 39s | the patch passed |
| +1 | Release audit (RAT) | 0m 37s | the patch passed |
| +1 | Check forbidden APIs | 0m 32s | the patch passed |
| -1 | Validate source patterns | 0m 32s | Validate source patterns validate-source-patterns failed |
|| || || || Other Tests ||
| +1 | unit | 14m 43s | core in the patch passed. |
| +1 | unit | 2m 21s | test-framework in the patch passed. |
| | | 20m 44s | |

|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-8653 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12955831/fst-reverse.patch |
| Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
| uname | Linux lucene1-us-west 4.4.0-137-generic #163~14.04.1-Ubuntu SMP Mon Sep 24 17:14:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / 72a99e9 |
| ant | version: Apache Ant(TM) version 1.9.3 compiled on July 24 2018 |
| Default Java | 1.8.0_191 |
| Validate source patterns | https://builds.apache.org/job/PreCommit-LUCENE-Build/154/artifact/out/patch-validate-source-patterns-root.txt |
| Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/154/testReport/ |
| modules | C: lucene/core lucene/test-framework U: lucene |
| Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/154/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |

This message was automatically generated.

> Reverse FST storage so it can be read forward
> ---------------------------------------------
>
> Key: LUCENE-8653
> URL: https://issues.apache.org/jira/browse/LUCENE-8653
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/FSTs
> Reporter: Mike Sokolov
> Priority: Major
> Attachments: fst-reverse.patch
>
>
> Discussion of keeping FST off-heap led to the idea of ensuring that FST's can
> be read forward in order to be more cache-friendly and align better with
> standard I/O practice. Today FSTs are read in reverse and this leads to some
> awkwardness, and you can't use standard readers so the code can be confusing
> to work with.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward
[ https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748985#comment-16748985 ]

Michael McCandless commented on LUCENE-8653:

Impressive how simple this was!

I think it's simpler to think about, reading the {{byte[]}} in forward order, and it ought to be a bit more cache friendly.

I agree jumping between FST nodes is very random access, but at a given node, as we scan the arcs looking for a match, the reads would become sequential byte reads with this change.

Curious that the impact is neutral, but maybe if we combine this with LUCENE-8635 we can measure an impact?
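
The arc-scanning point in the comment above can be illustrated with a small sketch. It assumes a deliberately simplified, hypothetical node layout (each arc is a one-byte label followed by a one-byte output, stored back to back); this is not Lucene's actual FST encoding, and the class and method names are invented for illustration. The two methods do the same logical scan, but one walks toward higher addresses and the other toward lower addresses.

```java
/**
 * Sketch only: a made-up node layout of [label, output] byte pairs.
 * Not Lucene's real FST format; all names here are hypothetical.
 */
final class ArcScanSketch {

    /** Scans arcs by moving toward higher addresses (forward storage). */
    static int findArcForward(byte[] bytes, int nodeStart, int numArcs, byte label) {
        int pos = nodeStart;
        for (int i = 0; i < numArcs; i++) {
            byte arcLabel = bytes[pos++];
            byte arcOutput = bytes[pos++];
            if (arcLabel == label) {
                return arcOutput;
            }
        }
        return -1; // no arc with this label
    }

    /** Scans the same logical arcs by moving toward lower addresses (reversed storage). */
    static int findArcBackward(byte[] bytes, int nodeStart, int numArcs, byte label) {
        int pos = nodeStart;
        for (int i = 0; i < numArcs; i++) {
            byte arcLabel = bytes[pos--];
            byte arcOutput = bytes[pos--];
            if (arcLabel == label) {
                return arcOutput;
            }
        }
        return -1; // no arc with this label
    }
}
```

Both methods visit the arcs in the same logical order and return the same output for the same label; only the direction of the address stream differs, which is the part that could matter for caching and for use with standard forward readers.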
[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward
[ https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748756#comment-16748756 ]

Mike Sokolov commented on LUCENE-8653:

The reverse reading is required because the FST serializes itself from an Object-heavy DAG of Nodes and Arcs into an array of bytes by traversing the DAG backwards, but writing forwards into the byte storage. It also optimizes straight-line sections of the DAG by eliminating the explicit pointers and just implicitly pointing to the (logically) next Node in the byte array, so "next" here means *at the next lower byte address*.

We can eliminate this reversal by reversing the byte array after serialization and fixing up the explicit pointers when we read them. We can't really fix them up in place without more major surgery because they are VInts.
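
The "reverse the array, then fix up pointers at read time" idea described above can be sketched roughly as follows. This is not the code from fst-reverse.patch; the class and method names are hypothetical. It assumes that a stored pointer is an absolute byte address into the pre-reversal array, so that after reversing an array of length n, old address p corresponds to new address n - 1 - p. The VInt decoding uses the usual 7-bits-per-byte format (low-order group first, high bit set when more bytes follow).

```java
/**
 * Sketch only: reverse serialized bytes once, then translate stored
 * addresses when they are read. All names here are hypothetical.
 */
final class ReversedFstBytesSketch {

    /** Returns a reversed copy of the serialized bytes. */
    static byte[] reverse(byte[] bytes) {
        byte[] out = new byte[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[bytes.length - 1 - i] = bytes[i];
        }
        return out;
    }

    /** Translates an address stored before reversal into the reversed array. */
    static int fixUp(int storedAddress, int length) {
        return length - 1 - storedAddress;
    }

    /** Decodes a VInt while moving toward lower addresses (the old direction). */
    static int readVIntBackward(byte[] bytes, int pos) {
        int b = bytes[pos--];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = bytes[pos--];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    /** Decodes a VInt while moving toward higher addresses (the new direction). */
    static int readVIntForward(byte[] bytes, int pos) {
        int b = bytes[pos++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = bytes[pos++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }
}
```

Under these assumptions a forward reader over the reversed array sees exactly the byte sequence the reverse reader saw before, so VInt decoding itself is unchanged and only decoded addresses need translating. One plausible reading of the "can't fix them up in place" remark is that a translated address may need a different number of VInt bytes than the original, which would shift every byte after it.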
[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward
[ https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748227#comment-16748227 ]

Mike Sokolov commented on LUCENE-8653:

Yeah, some initial measurements using luceneutil are showing pretty neutral impact. I am also thinking of looking at memory-starved conditions to see what happens there, and there is an idea that this could end up helping the off-heap use case in LUCENE-8635.
[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward
[ https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748209#comment-16748209 ]

Dawid Weiss commented on LUCENE-8653:

FSTs are by design really cache-unfriendly. It's a graph that is traversed in a very irregular way. I don't think making it linear instead of reverse order will help much, but it's worth a shot.

Also: Mike's "reverse" reader does have a reason (but I can't remember what it was off the top of my head).