[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow
[ https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549654#comment-13549654 ]

Uwe Schindler commented on LUCENE-4675:
---------------------------------------

Strong +1 to make BytesRef a byte[] reference only. BytesRef is unfortunately a user-facing class in Lucene 4.x, so we have to look into this. I was also planning to fix this before 4.0, but we had no time. This was one of the last classes that Robert and I did not fix in the final cleanup before release, which is a pity.

> remove *Ref.copy/append/grow
>
> Key: LUCENE-4675
> URL: https://issues.apache.org/jira/browse/LUCENE-4675
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
>
> These methods are dangerous:
> In general, if we want a StringBuilder-type class, then it should own the
> array, and it can freely do allocation etc. This is the only way to
> make it safe.
> Otherwise, if we want a ByteBuffer-type class, then its reference should be
> immutable (the byte[]/offset/length should be final), and it should not have
> allocation methods.
> BytesRef is neither of these; it's like a C pointer. Unfortunately, Lucene puts
> these unsafe, dangerous, trappy APIs directly in front of the user.
> What happens if I have a bug in my application and it accidentally mucks with
> the term bytes returned by TermsEnum or the payloads from
> DocsAndPositionsEnum? Will this get merged into a corrupt index?
> I think as a start we should remove copy/append/grow to minimize this and
> bring it closer to a ref class (e.g. more like java.lang.ref and less like
> StringBuilder). Nobody needs this stuff on BytesRef; they can already operate
> on the bytes directly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
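The two safe designs named in the issue description can be sketched roughly as below. This is only an illustration, not Lucene code: `ImmutableBytesRef` and `BytesBuilder` are hypothetical names invented here, and the initial capacity and growth policy are arbitrary assumptions.

```java
import java.util.Arrays;

/** Hypothetical ByteBuffer-style reference: fields are final and there are no allocation methods. */
final class ImmutableBytesRef {
  // 'final' fixes which slice is referenced; the array contents themselves remain mutable in Java.
  final byte[] bytes;
  final int offset;
  final int length;

  ImmutableBytesRef(byte[] bytes, int offset, int length) {
    if (offset < 0 || length < 0 || offset > bytes.length - length) {
      throw new IllegalArgumentException("slice out of bounds");
    }
    this.bytes = bytes;
    this.offset = offset;
    this.length = length;
  }
}

/** Hypothetical StringBuilder-style class: it owns its array, so it may reallocate freely. */
final class BytesBuilder {
  private byte[] buffer = new byte[8]; // never handed out to callers
  private int length = 0;

  void append(byte[] src, int srcOffset, int srcLength) {
    if (length + srcLength > buffer.length) {
      // Safe to reallocate: nobody else holds a reference to 'buffer'.
      buffer = Arrays.copyOf(buffer, Math.max(buffer.length * 2, length + srcLength));
    }
    System.arraycopy(src, srcOffset, buffer, length, srcLength);
    length += srcLength;
  }

  int length() {
    return length;
  }

  /** Hands out a copy, so the returned ref never aliases the internal buffer. */
  ImmutableBytesRef toRef() {
    return new ImmutableBytesRef(Arrays.copyOf(buffer, length), 0, length);
  }
}
```

The point of the split is that reallocation only ever happens inside the class that owns the array, while the reference type has no way to re-point or resize anything.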
[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow
[ https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549637#comment-13549637 ]

Robert Muir commented on LUCENE-4675:
-------------------------------------

I don't think we should add more functionality to these *Ref classes: they have too many traps and bugs already. Less is more here.
[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow
[ https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549632#comment-13549632 ]

Shai Erera commented on LUCENE-4675:
------------------------------------

bq. you can separately make your own BytesRefIterator class

I can. I wanted to avoid additional object allocations, but such an Iterator class can have a reset(BytesRef) method which will update its pos and upto members accordingly. I was thinking that an 'upto' index might be useful for others. For my purposes (see LUCENE-4620) I just use bytes.offset as 'pos', compute an 'upto', and pass it along.

I will think about the Iterator class though; perhaps it's not a bad idea. And maybe *Ref can have an iterator() method which returns the proper one ... or not.
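A reusable iterator of the kind described in this comment might look something like the sketch below. All names are hypothetical: `SliceRef` is a minimal stand-in for a byte-slice reference (it is not Lucene's BytesRef), and `SliceCursor` is an invented class with the `pos` and `upto` members mentioned above.

```java
import java.util.NoSuchElementException;

/** Minimal stand-in for a byte-slice reference (not Lucene's BytesRef). */
final class SliceRef {
  byte[] bytes;
  int offset;
  int length;

  SliceRef(byte[] bytes, int offset, int length) {
    this.bytes = bytes;
    this.offset = offset;
    this.length = length;
  }
}

/** Hypothetical reusable cursor: one instance can walk many slices without further allocation. */
final class SliceCursor {
  private byte[] bytes;
  private int pos;  // index of the next byte to return
  private int upto; // exclusive end index of the current slice

  /** Points the cursor at a new slice; the ref itself is left untouched. */
  void reset(SliceRef ref) {
    bytes = ref.bytes;
    pos = ref.offset;
    upto = ref.offset + ref.length;
  }

  boolean hasNext() {
    return pos < upto;
  }

  byte next() {
    if (pos >= upto) {
      throw new NoSuchElementException();
    }
    return bytes[pos++];
  }
}
```

Because the iteration state lives in the cursor, the reference class needs no extra members, and a single cursor can be reset for each new slice instead of allocating a fresh object.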
[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow
[ https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549611#comment-13549611 ]

Robert Muir commented on LUCENE-4675:
-------------------------------------

I don't think we need any additional members in this thing. What more does it need other than byte[], offset, length?! I want to remove the extraneous stuff.

If you want to make an iterator, you can separately make your own BytesRefIterator class?
[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow
[ https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549598#comment-13549598 ]

Shai Erera commented on LUCENE-4675:
------------------------------------

OK. While you're at it, what do you think about adding an 'upto' member for easier iteration over the bytes/ints/chars? (See my comment on LUCENE-4674.)
[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow
[ https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549590#comment-13549590 ]

Robert Muir commented on LUCENE-4675:
-------------------------------------

I'm proposing removing these 3 methods from BytesRef itself, that's all.

The guy from the outside knows what he can do: he knows if the bytes actually point to a slice of a PagedBytes (grow is actually senseless here!), or just a simple byte[], or whatever. He doesn't need BytesRef itself to do these things.

So he can then change the ref to point at a different slice, or a different byte[] altogether, or whatever.
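Growing "from the outside", as this comment suggests, could be sketched as follows. The names are hypothetical: `MutableRef` is a minimal stand-in with non-final fields (it is not Lucene's BytesRef), and `ScratchOwner` plays the caller that knows it owns a plain byte[] and can therefore reallocate it safely. The doubling growth policy is an arbitrary assumption.

```java
/** Minimal stand-in for a mutable byte-slice reference (not Lucene's BytesRef). */
final class MutableRef {
  byte[] bytes;
  int offset;
  int length;
}

/** Hypothetical caller that owns its scratch array and only re-points the ref. */
final class ScratchOwner {
  private byte[] scratch = new byte[4]; // owned here, so reallocation is safe
  private final MutableRef ref = new MutableRef();

  /** Copies src into the scratch array, growing it if needed, then re-points the ref. */
  MutableRef fill(byte[] src, int srcOffset, int srcLength) {
    if (srcLength > scratch.length) {
      // The owner decides how to grow; the ref class has no say in allocation.
      scratch = new byte[Math.max(srcLength, scratch.length * 2)];
    }
    System.arraycopy(src, srcOffset, scratch, 0, srcLength);
    ref.bytes = scratch;
    ref.offset = 0;
    ref.length = srcLength;
    return ref;
  }
}
```

A caller whose ref points into a shared page would simply never take the growth branch; it would re-point the ref at a different slice instead.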
[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow
[ https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549589#comment-13549589 ]

Shai Erera commented on LUCENE-4675:
------------------------------------

I kinda like grow(). Will I be able to grow() the buffer from the outside if you remove it? I.e. will the byte[] not be final?