[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13741604#comment-13741604
 ] 

ASF subversion and git services commented on LUCENE-5152:
-

Commit 1514520 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1514520 ]

LUCENE-5152: make assert less costly

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch, 
 LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13741609#comment-13741609
 ] 

ASF subversion and git services commented on LUCENE-5152:
-

Commit 1514521 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1514521 ]

LUCENE-5152: make assert less costly

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch, 
 LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-14 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739809#comment-13739809
 ] 

Michael McCandless commented on LUCENE-5152:


I plan to commit that last patch soon...

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch, 
 LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-14 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739912#comment-13739912
 ] 

Simon Willnauer commented on LUCENE-5152:
-

sorry for the late reply busy times... +1 to commit

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch, 
 LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-08 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733454#comment-13733454
 ] 

Michael McCandless commented on LUCENE-5152:


bq. can you elaborate what you are concerned about?

I'm worried about the O(N^2) cost of the assert: for every arc (single
byte of each term in a seekExact) we are iterating over all root arcs
(up to 256 arcs) in this assert.

bq. findTargetArc is the only place where we actually use this cache?

Ahh that's true, I hadn't realized that.

Maybe, instead, we can move the assert just inside the if that
actually uses the cached arcs?  Ie, put it here:

{code}
  if (follow.target == startNode  labelToMatch  cachedRootArcs.length) {
assert assertRootArcs();
...
  }
{code}

This would address my concern: the cost becomes O(N) not O(N^2).  And
the coverage is the same?


 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-08 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733474#comment-13733474
 ] 

Simon Willnauer commented on LUCENE-5152:
-

bq. This would address my concern: the cost becomes O(N) not O(N^2). And the 
coverage is the same?

The problem here is that we really need to check after we returned from the 
cache and that might be the case only once in a certain test. Yet, I think it's 
OK to do it there. I still don't get what you are concerned of we only have -ea 
in tests and the tests don't seem to be any slower? Can you elaborate what you 
are afraid of?

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-08 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733598#comment-13733598
 ] 

Michael McCandless commented on LUCENE-5152:


bq. Can you elaborate what you are afraid of?

In general I think it's bad if an assert changes too much how the code
would run without asserts.  E.g., maybe this O(N^2) assert alters how
threads are scheduled and changes how / whether an issue appears in
practice.

Similarly, if a user is having trouble, I'll recommend turning on
asserts to see if one trips, but if this causes a change in how the
code runs then this can change whether the issue reproduces.

I also just don't like O(N^2) code, even when it's under an assert :)

I think asserts should minimize their impact to the real code when
possible, and it certainly seems possible in this case.

Separately, we really should run our tests w/o asserts, too, since
this is how our users typically run (I know some tests fail if
assertions are off ... we'd have to fix them).  What if we accidentally
commit real code behind an assert?  Our tests wouldn't catch it ...


 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-07 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732088#comment-13732088
 ] 

Simon Willnauer commented on LUCENE-5152:
-

{quote}In fact, I think findTargetArc isn't great in this regard; e.g. I
think MemoryPF only uses this API if the caller calls seekExact? So I
think the current location of the assert is both more costly and lower
coverage than if we moved it to FST.getBytesReader.{quote}

I am not sure if that is true actually since findTargetArc is the only place 
where we actually use this cache? I mean I don't wanna be in the way of moving 
it but what kind of message are we sending here. We don't guarantee that 
asserts are fast but they won't slow you down massively? I mean we can call 
this assert in the getBytesReader as well to increase coverage, can you 
elaborate what you are concerned about?


 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-02 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727927#comment-13727927
 ] 

David Smiley commented on LUCENE-5152:
--

I've been bitten by bugs in my code related to sharing BytesRefs.  It's not 
clear in the APIs who owns a BytesRef received from a method call.  In other 
words, can a recipient know wether it can safely cache it for use later or 
wether it's not safe because it could change suddenly later.  It's not even 
just that since there are two levels of ownership, the BytesRef instance and 
then the underlying char array.  This could be partially addressed with more 
documentation; I did that in my own code but that didn't stop another bug.  

I have a rough idea in which a BytesRef has a boolean flag or two to indicate 
how share-able it is, and then a conditional clone method that either returns 
the current instance or returns shallow copy, or does a deep copy depending on 
those flags.  I dunno, just a half-baked idea.

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727956#comment-13727956
 ] 

ASF subversion and git services commented on LUCENE-5152:
-

Commit 1509805 from [~simonw] in branch 'dev/trunk'
[ https://svn.apache.org/r1509805 ]

LUCENE-5152: Fix MemoryPostingsFormat to not modify borrowed BytesRef from 
FSTEnum

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-02 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727958#comment-13727958
 ] 

Simon Willnauer commented on LUCENE-5152:
-

I committed the latest patch to get the assert in and fix memory postings 
format. I will keep this issue open for further discussions

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727963#comment-13727963
 ] 

ASF subversion and git services commented on LUCENE-5152:
-

Commit 1509812 from [~simonw] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1509812 ]

LUCENE-5152: Fix MemoryPostingsFormat to not modify borrowed BytesRef from 
FSTEnum

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-02 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727986#comment-13727986
 ] 

Michael McCandless commented on LUCENE-5152:


I know we make no claims about performance when assertions are enabled...

But it seems like this commit could substantially slow things down, since we 
call assertRootArcs, which is O(N) where N = number of root arcs, for every 
call to findTargetArc = likely called in hotspots.

Maybe, we could move the call to e.g. FST.getBytesReader?

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-02 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727993#comment-13727993
 ] 

Simon Willnauer commented on LUCENE-5152:
-

how is getBytesReader related to the root arcs? unless this slows things down 
when assertions are not enabled I'd vote for not moving it. This is a very 
trappy thing and we should catch any violation IMO very quickly.

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-02 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728317#comment-13728317
 ] 

Michael McCandless commented on LUCENE-5152:


bq. how is getBytesReader related to the root arcs?

Well, anything that works with the FST APIs needs to call
getBytesReader first, e.g. MemoryPF does this every time you pull a
TermsEnum from it.

bq. This is a very trappy thing and we should catch any violation IMO very 
quickly.

I agree it's trappy and it's great to add this check.

I'm simply proposing moving it to less of a hot-spot, and I don't
think this will affect how quickly we catch violations but should
reduce the cost of this added assertion.

In fact, I think findTargetArc isn't great in this regard; e.g. I
think MemoryPF only uses this API if the caller calls seekExact?  So I
think the current location of the assert is both more costly and lower
coverage than if we moved it to FST.getBytesReader.


 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-01 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726847#comment-13726847
 ] 

Simon Willnauer commented on LUCENE-5152:
-

bq. Should we clone payload bytes in the postings lists too? what about term 
dictionaries?
I agree we can be less conservative here and just use the payload and copy it 
into a new BytesRef or whatever is needed. I will bring up a new patch.

bq. At some point then BytesRef is useless as a reference class because of a 
few bad apples trying to use it as a ByteBuffer. Ideally we would remove code 
that abuses BytesRef as a ByteBuffer instead.

agreed again. We just need to make sure that we have asserts in place that 
check for that. 

bq. I don't mean to pick on your issue Simon, and it doesnt mean I object to 
the patch (though I wonder about performance implications), I just see this as 
one of many in a larger issue.

no worries. I am really concerned about this since it took me forever to figure 
out the problems this caused. I just wanna have an infra in place that catches 
those problems. I am more concerned about users that get bitten by this. I 
agree we should figure out the bigger problem eventually but lets make sure 
that we fix the bad apples first

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch


 a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned 
 output from and FST (BytesRef) which caused sideffects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified and in-fact this happens for instance in our MemoryPostingsFormat 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org