[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward

2019-01-23 Thread Lucene/Solr QA (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16750314#comment-16750314
 ] 

Lucene/Solr QA commented on LUCENE-8653:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green}  0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green}  0m 32s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} Validate source patterns {color} | {color:red}  0m 32s{color} | {color:red} Validate source patterns validate-source-patterns failed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 43s{color} | {color:green} core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 21s{color} | {color:green} test-framework in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 44s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-8653 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12955831/fst-reverse.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  validatesourcepatterns  |
| uname | Linux lucene1-us-west 4.4.0-137-generic #163~14.04.1-Ubuntu SMP Mon Sep 24 17:14:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / 72a99e9 |
| ant | version: Apache Ant(TM) version 1.9.3 compiled on July 24 2018 |
| Default Java | 1.8.0_191 |
| Validate source patterns | https://builds.apache.org/job/PreCommit-LUCENE-Build/154/artifact/out/patch-validate-source-patterns-root.txt |
| Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/154/testReport/ |
| modules | C: lucene/core lucene/test-framework U: lucene |
| Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/154/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Reverse FST storage so it can be read forward
> -
>
> Key: LUCENE-8653
> URL: https://issues.apache.org/jira/browse/LUCENE-8653
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/FSTs
>Reporter: Mike Sokolov
>Priority: Major
> Attachments: fst-reverse.patch
>
>
> Discussion of keeping FSTs off-heap led to the idea of ensuring that FSTs can 
> be read forward, which should be more cache-friendly and align better with 
> standard I/O practice. Today FSTs are read in reverse, which leads to some 
> awkwardness: standard readers can't be used, so the code can be confusing 
> to work with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward

2019-01-22 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748985#comment-16748985
 ] 

Michael McCandless commented on LUCENE-8653:


Impressive how simple this was!  I think reading the {{byte[]}} in forward 
order is simpler to reason about, and it ought to be a bit more cache-friendly.  
I agree that jumping between FST nodes is very random access, but within a 
given node, scanning the arcs for a match would become sequential byte reads 
with this change.  Curious that the impact is neutral, but maybe if we combine 
this with LUCENE-8635 we can measure an impact?
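
To illustrate the arc-scan point above, here is a minimal Java sketch. It assumes a hypothetical toy layout (one count byte followed by label/target byte pairs), which is not Lucene's actual FST encoding; the class and method names are made up for illustration. It only shows that once a node's arcs sit at ascending addresses, a lookup is a plain forward walk over adjacent bytes.

```java
final class ArcScanSketch {
    // Hypothetical toy layout, NOT Lucene's format: the byte at nodeAddress holds
    // the arc count, followed by (label, target) byte pairs at ascending addresses.
    static int findTarget(byte[] fst, int nodeAddress, byte label) {
        int pos = nodeAddress;
        int arcCount = fst[pos++];
        for (int i = 0; i < arcCount; i++) {
            byte arcLabel = fst[pos++]; // each read touches the next higher address
            byte target = fst[pos++];
            if (arcLabel == label) {
                return target;
            }
        }
        return -1; // no arc with this label leaves the node
    }

    public static void main(String[] args) {
        byte[] fst = {3, 'a', 7, 'b', 9, 'c', 11};
        System.out.println(findTarget(fst, 0, (byte) 'b'));
        System.out.println(findTarget(fst, 0, (byte) 'z'));
    }
}
```

Following the returned target to another node is still a random jump, which matches the observation that only the per-node scan becomes sequential.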




[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward

2019-01-22 Thread Mike Sokolov (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748756#comment-16748756
 ] 

Mike Sokolov commented on LUCENE-8653:
--

The reverse reading is required because the FST serializes itself from an 
object-heavy DAG of Nodes and Arcs into an array of bytes by traversing the DAG 
backwards while writing forwards into the byte storage. It also optimizes 
straight-line sections of the DAG by eliminating the explicit pointers and 
implicitly pointing to the (logically) next Node in the byte array, so "next" 
here means *at the next lower byte address*. We can eliminate this reversal by 
reversing the byte array after serialization and fixing up the explicit 
pointers when we read them. We can't really fix them up in place without more 
major surgery because they are VInts, whose encoded length depends on the value.
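
The reverse-then-fix-up idea can be sketched in Java as follows. This is a toy illustration under stated assumptions, not the code in fst-reverse.patch: the class and helper names are hypothetical, and pointers are treated as plain array indexes rather than VInt-encoded addresses. The point it shows is that after reversing an array of length n, the byte at old address p sits at n - 1 - p, so translating a stored pointer on read lets a forward reader see the same byte sequence the reverse reader saw.

```java
import java.util.Arrays;

final class ReverseFstSketch {
    // Return a reversed copy: the byte at old index i lands at length - 1 - i.
    static byte[] reversed(byte[] data) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[data.length - 1 - i] = data[i];
        }
        return out;
    }

    // Translate an address stored against the old layout into the reversed layout.
    static int fixUp(int oldAddress, int length) {
        return length - 1 - oldAddress;
    }

    // Old style: read count bytes starting at start, moving toward lower addresses.
    static byte[] readBackward(byte[] data, int start, int count) {
        byte[] out = new byte[count];
        for (int i = 0; i < count; i++) {
            out[i] = data[start - i];
        }
        return out;
    }

    // New style: read count bytes starting at start, moving toward higher addresses.
    static byte[] readForward(byte[] data, int start, int count) {
        return Arrays.copyOfRange(data, start, start + count);
    }

    public static void main(String[] args) {
        byte[] original = {10, 20, 30, 40, 50, 60};
        byte[] flipped = reversed(original);
        int oldPointer = 4;                                   // address in the old layout
        int newPointer = fixUp(oldPointer, original.length);  // translated on read
        byte[] before = readBackward(original, oldPointer, 3);
        byte[] after = readForward(flipped, newPointer, 3);
        System.out.println(Arrays.equals(before, after));
    }
}
```

Translating on read leaves the stored pointer bytes untouched, which is why it sidesteps the in-place rewrite problem mentioned above.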




[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward

2019-01-21 Thread Mike Sokolov (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748227#comment-16748227
 ] 

Mike Sokolov commented on LUCENE-8653:
--

Yeah, some initial measurements using luceneutil show a pretty neutral 
impact. I am also thinking of looking at memory-starved conditions to see what 
happens there, and there is an idea that this could end up helping the off-heap 
use case in LUCENE-8635.

 




[jira] [Commented] (LUCENE-8653) Reverse FST storage so it can be read forward

2019-01-21 Thread Dawid Weiss (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748209#comment-16748209
 ] 

Dawid Weiss commented on LUCENE-8653:
-

FSTs are, by design, really cache-unfriendly. An FST is a graph that is 
traversed in a very irregular way. I don't think reading it forward instead of 
in reverse order will help much, but it's worth a shot. Also: Mike's "reverse" 
reader does have a reason (but I can't remember what it was off the top of my head).
