[jira] [Commented] (LUCENE-4674) Consistently set offset=0 in BytesRef.copyBytes
[ https://issues.apache.org/jira/browse/LUCENE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549533#comment-13549533 ] Robert Muir commented on LUCENE-4674: - I dont really agree (i dont think this class should be treated like stringbuffer). changing offset to 0 is fine when we make a new array: otherwise it is definitely and 100% certainly NOT OK as we may overwrite unrelated data. Consistently set offset=0 in BytesRef.copyBytes --- Key: LUCENE-4674 URL: https://issues.apache.org/jira/browse/LUCENE-4674 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-4674.patch BytesRef.copyBytes(BytesRef other) has two branches: - either the destination array is large enough and it will copy bytes after offset, - or it needs to resize and in that case it will set offset = 0. I think this method should always set offset = 0 for consistency, and to avoid resizing when other.length is larger than this.bytes.length - this.offset but smaller than this.bytes.length. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4674) Consistently set offset=0 in BytesRef.copyBytes
[ https://issues.apache.org/jira/browse/LUCENE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549534#comment-13549534 ] Robert Muir commented on LUCENE-4674: - moreover, any proposed changes here should also include the changes to IntsRef, LongsRef, CharsRef, and so on before even being considered. Otherwise the apis just get out of wack. Maybe we should just seriously consider just switching to java.nio.Buffer. Consistently set offset=0 in BytesRef.copyBytes --- Key: LUCENE-4674 URL: https://issues.apache.org/jira/browse/LUCENE-4674 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-4674.patch BytesRef.copyBytes(BytesRef other) has two branches: - either the destination array is large enough and it will copy bytes after offset, - or it needs to resize and in that case it will set offset = 0. I think this method should always set offset = 0 for consistency, and to avoid resizing when other.length is larger than this.bytes.length - this.offset but smaller than this.bytes.length. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4674) Consistently set offset=0 in BytesRef.copyBytes
[ https://issues.apache.org/jira/browse/LUCENE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549536#comment-13549536 ] Uwe Schindler commented on LUCENE-4674: --- I agree with Robert. We had BytesRef and CharsRef in the past doing that stuff. But as the name of the class is *Ref not *Buffer, it should only hold a reference to a byte[] and not change it or grow it. Esspecially it should not change offset. This is risky, if you get a BytesRef that points to some slice in a larger buffer and you suddenly resize it, invalidating content that might be needed by other stuff (e.g. while iterating terms, the previous/next term gets corrupted). I would in any case favour to use ByteBuffer instead of this unsafe and inncomplete duplicate. BytesRef is for user-facing APIs a mess. Consistently set offset=0 in BytesRef.copyBytes --- Key: LUCENE-4674 URL: https://issues.apache.org/jira/browse/LUCENE-4674 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-4674.patch BytesRef.copyBytes(BytesRef other) has two branches: - either the destination array is large enough and it will copy bytes after offset, - or it needs to resize and in that case it will set offset = 0. I think this method should always set offset = 0 for consistency, and to avoid resizing when other.length is larger than this.bytes.length - this.offset but smaller than this.bytes.length. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4674) Consistently set offset=0 in BytesRef.copyBytes
[ https://issues.apache.org/jira/browse/LUCENE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549538#comment-13549538 ] Adrien Grand commented on LUCENE-4674: -- I still find confusing that we are allowed to write past offset + length but not before offset. Switching to the java.nio buffers sounds good. Consistently set offset=0 in BytesRef.copyBytes --- Key: LUCENE-4674 URL: https://issues.apache.org/jira/browse/LUCENE-4674 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-4674.patch BytesRef.copyBytes(BytesRef other) has two branches: - either the destination array is large enough and it will copy bytes after offset, - or it needs to resize and in that case it will set offset = 0. I think this method should always set offset = 0 for consistency, and to avoid resizing when other.length is larger than this.bytes.length - this.offset but smaller than this.bytes.length. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4674) Consistently set offset=0 in BytesRef.copyBytes
[ https://issues.apache.org/jira/browse/LUCENE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549542#comment-13549542 ] Shai Erera commented on LUCENE-4674: I recently (LUCENE-4620) moved some facets code to use BytesRef and IntsRef and found these two classes very convenient. The only thing that I found missing is a *Ref.upto. E.g., I first made the mistake {{for (int i = bytes.offset; i bytes.length; i++)}}, where the correct form is {{for (int i = bytes.offset; i bytes.length + bytes.offset; i++)}} (but then you need to do that '+' at every iteration, or extract it to a variable). I considered using BytesBuffer instead, but as long as e.g. a Payload is represented as a BytesRef, it's a waste to always ByteBuffer.wrap(BytesRef.bytes, offset, length). I used BytesRef as it was very convenient (and if we add an 'upto' index to them, that'd even be greater :)). I agree that grow() is currently risky, as it may override some data that is used by another thread (as a slice to the buffer). But that can be solved with proper documentation I think. I also agree that we should not set offset to 0. I did that, and MemoryCodec got upset :). For all practical purposes, apps should treat offset and length as final (we should not make them final though, just document it). If an app messes with them, it should better know what it's doing. Consistently set offset=0 in BytesRef.copyBytes --- Key: LUCENE-4674 URL: https://issues.apache.org/jira/browse/LUCENE-4674 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-4674.patch BytesRef.copyBytes(BytesRef other) has two branches: - either the destination array is large enough and it will copy bytes after offset, - or it needs to resize and in that case it will set offset = 0. I think this method should always set offset = 0 for consistency, and to avoid resizing when other.length is larger than this.bytes.length - this.offset but smaller than this.bytes.length. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4674) Consistently set offset=0 in BytesRef.copyBytes
[ https://issues.apache.org/jira/browse/LUCENE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549543#comment-13549543 ] Robert Muir commented on LUCENE-4674: - the whole class is confusing. but the problem with this proposed change is very simple: BytesRef a = new BytesRef(bigbyte, 0, 5); BytesRef b = new BytesRef(bigbyte, 5, 10); b.copy(someOtherStuff...) should *NOT* muck with a. A is unrelated to B. I think realistically we should avoid methods like append/copy alltogether as they encourage more stringbuffer-type use like this. if you want a stringbuffer-type class, it can safely support methods like this, but then it should *own the array* (make a copy). Consistently set offset=0 in BytesRef.copyBytes --- Key: LUCENE-4674 URL: https://issues.apache.org/jira/browse/LUCENE-4674 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-4674.patch BytesRef.copyBytes(BytesRef other) has two branches: - either the destination array is large enough and it will copy bytes after offset, - or it needs to resize and in that case it will set offset = 0. I think this method should always set offset = 0 for consistency, and to avoid resizing when other.length is larger than this.bytes.length - this.offset but smaller than this.bytes.length. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4674) Consistently set offset=0 in BytesRef.copyBytes
[ https://issues.apache.org/jira/browse/LUCENE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549544#comment-13549544 ] Adrien Grand commented on LUCENE-4674: -- bq. b.copy(someOtherStuff...) should NOT muck with a. Unfortunately a.copy(otherStuff) will modify b if otherStuff.length 5. Consistently set offset=0 in BytesRef.copyBytes --- Key: LUCENE-4674 URL: https://issues.apache.org/jira/browse/LUCENE-4674 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-4674.patch BytesRef.copyBytes(BytesRef other) has two branches: - either the destination array is large enough and it will copy bytes after offset, - or it needs to resize and in that case it will set offset = 0. I think this method should always set offset = 0 for consistency, and to avoid resizing when other.length is larger than this.bytes.length - this.offset but smaller than this.bytes.length. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4674) Consistently set offset=0 in BytesRef.copyBytes
[ https://issues.apache.org/jira/browse/LUCENE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549554#comment-13549554 ] Robert Muir commented on LUCENE-4674: - I will open a new issue to remove all write methods from bytesref. this is a ref class, not a stringbuilder. we have to keep these apis contained. Consistently set offset=0 in BytesRef.copyBytes --- Key: LUCENE-4674 URL: https://issues.apache.org/jira/browse/LUCENE-4674 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-4674.patch BytesRef.copyBytes(BytesRef other) has two branches: - either the destination array is large enough and it will copy bytes after offset, - or it needs to resize and in that case it will set offset = 0. I think this method should always set offset = 0 for consistency, and to avoid resizing when other.length is larger than this.bytes.length - this.offset but smaller than this.bytes.length. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4674) Consistently set offset=0 in BytesRef.copyBytes
[ https://issues.apache.org/jira/browse/LUCENE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549596#comment-13549596 ] Robert Muir commented on LUCENE-4674: - {quote} Unfortunately a.copy(otherStuff) will modify b if otherStuff.length 5. {quote} I still like the idea of fixing this myself (maybe Shai's idea?). i don't like this kind of dangerous stuff!! I ultimately think LUCENE-4675 is the next logical step, but can we remove this a.copy()-overwrites-b trap as an incremental improvement? thats a bug in my opinion. Consistently set offset=0 in BytesRef.copyBytes --- Key: LUCENE-4674 URL: https://issues.apache.org/jira/browse/LUCENE-4674 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-4674.patch BytesRef.copyBytes(BytesRef other) has two branches: - either the destination array is large enough and it will copy bytes after offset, - or it needs to resize and in that case it will set offset = 0. I think this method should always set offset = 0 for consistency, and to avoid resizing when other.length is larger than this.bytes.length - this.offset but smaller than this.bytes.length. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4674) Consistently set offset=0 in BytesRef.copyBytes
[ https://issues.apache.org/jira/browse/LUCENE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549625#comment-13549625 ] Adrien Grand commented on LUCENE-4674: -- bq. I still like the idea of fixing this myself (maybe Shai's idea?). i don't like this kind of dangerous stuff!! The 'upto' idea or allocating a new byte[] if someOtherStuff offset + length this.offset + length? ? bq. I ultimately think LUCENE-4675 is the next logical step, but can we remove this a.copy()-overwrites-b trap as an incremental improvement? Regarding the idea to switch to the java.nio buffers, are there some traps besides backward compatibility? Should we start migrating our internal APIs to this API (and maybe even the public ones for 5.0?). Consistently set offset=0 in BytesRef.copyBytes --- Key: LUCENE-4674 URL: https://issues.apache.org/jira/browse/LUCENE-4674 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-4674.patch BytesRef.copyBytes(BytesRef other) has two branches: - either the destination array is large enough and it will copy bytes after offset, - or it needs to resize and in that case it will set offset = 0. I think this method should always set offset = 0 for consistency, and to avoid resizing when other.length is larger than this.bytes.length - this.offset but smaller than this.bytes.length. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4674) Consistently set offset=0 in BytesRef.copyBytes
[ https://issues.apache.org/jira/browse/LUCENE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13549630#comment-13549630 ] Robert Muir commented on LUCENE-4674: - {quote} allocating a new byte[] if someOtherStuff offset + length this.offset + length? ? {quote} This, preventing a.copy(otherStuff) from overflowing onto b. I dont want any other functionality in this class. it needs less, not more. {quote} Regarding the idea to switch to the java.nio buffers, are there some traps besides backward compatibility? Should we start migrating our internal APIs to this API (and maybe even the public ones for 5.0?). {quote} I haven't even thought about it really. I actually am less concerned about our internal apis. Its the public ones i care about. I would care a lot less about BytesRef co if users werent forced to interact with them. Consistently set offset=0 in BytesRef.copyBytes --- Key: LUCENE-4674 URL: https://issues.apache.org/jira/browse/LUCENE-4674 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-4674.patch BytesRef.copyBytes(BytesRef other) has two branches: - either the destination array is large enough and it will copy bytes after offset, - or it needs to resize and in that case it will set offset = 0. I think this method should always set offset = 0 for consistency, and to avoid resizing when other.length is larger than this.bytes.length - this.offset but smaller than this.bytes.length. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org