[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow

2013-01-10 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549654#comment-13549654
 ] 

Uwe Schindler commented on LUCENE-4675:
---

Strong +1 to make BytesRef a byte[] reference only. BytesRef is unfortunately a
user-facing class in Lucene 4.x, so we have to look into this. I was also
planning to fix this before 4.0, but we had no time. This was one of the last
classes that Robert and I did not fix in the final cleanup before release, which
is a pity.

> remove *Ref.copy/append/grow
> 
>
> Key: LUCENE-4675
> URL: https://issues.apache.org/jira/browse/LUCENE-4675
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
>
> These methods are dangerous:
> In general, if we want a StringBuilder-type class, then it should own the
> array, and it can freely do allocation etc. This is the only way to
> make it safe.
> Otherwise, if we want a ByteBuffer-type class, then its reference should be
> immutable (the byte[]/offset/length should be final), and it should not
> allocate.
> BytesRef is neither of these; it's like a C pointer. Unfortunately Lucene puts
> these unsafe, dangerous, trappy APIs directly in front of the user.
> What happens if I have a bug in my application and it accidentally mucks with
> the term bytes returned by TermsEnum or the payloads from
> DocsAndPositionsEnum? Will this get merged into a corrupt index?
> I think as a start we should remove copy/append/grow to bring this
> closer to a ref class (e.g. more like java.lang.ref and less like
> StringBuilder). Nobody needs this stuff on BytesRef; they can already operate
> on the bytes directly.
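
A minimal sketch of the hazard described in the issue, assuming a stand-in
class rather than Lucene's actual BytesRef. SliceRef, appendInPlace and
AliasingDemo are hypothetical names invented for illustration; the only thing
taken from the thread is the byte[]/offset/length shape. It shows how an
append that writes into spare room of a shared array can silently overwrite
a neighbouring value.

```java
// Stand-in for a mutable ref: NOT Lucene's BytesRef, only the same three fields.
final class SliceRef {
    byte[] bytes;
    int offset;
    int length;

    SliceRef(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    // Hypothetical in-place append: writes just past the end of the slice when
    // the backing array has room, without knowing who else uses that room.
    void appendInPlace(byte b) {
        if (offset + length >= bytes.length) {
            throw new IllegalStateException("no room in backing array");
        }
        bytes[offset + length] = b;
        length++;
    }
}

final class AliasingDemo {
    // Returns the first byte of the neighbouring slice after appending to the first one.
    static byte clobberNeighbour() {
        byte[] shared = {'a', 'b', 'c', 'd'};         // one buffer, two logical values
        SliceRef first = new SliceRef(shared, 0, 2);  // views "ab"
        SliceRef second = new SliceRef(shared, 2, 2); // views "cd"
        first.appendInPlace((byte) 'X');              // lands on second's first byte
        return second.bytes[second.offset];
    }

    public static void main(String[] args) {
        System.out.println((char) clobberNeighbour()); // prints X
    }
}
```

The ref cannot tell a private array from a slice of a larger buffer, which is
the argument for leaving allocation decisions to code that does know.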

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow

2013-01-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549637#comment-13549637
 ] 

Robert Muir commented on LUCENE-4675:
-

I don't think we should add more functionality to these *Ref classes: they have
too many traps and bugs already.

Less is more here.




[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow

2013-01-10 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549632#comment-13549632
 ] 

Shai Erera commented on LUCENE-4675:


bq. you can separately make your own BytesRefIterator class

I can. I wanted to avoid additional object allocations, but such an Iterator
class can have a reset(BytesRef) method which will update its pos and upto
members accordingly. I was thinking that an 'upto' index might be useful for
others. For my purposes (see LUCENE-4620) I just use bytes.offset as 'pos',
compute an 'upto', and pass it along. I will think about the Iterator class
though; perhaps it's not a bad idea. And maybe *Ref can have an iterator()
method which returns the proper one ... or not.
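
A rough sketch of what such a reusable iterator could look like, under
assumptions: ByteSliceCursor and CursorDemo are hypothetical names, not
Lucene classes, and reset takes the three fields directly instead of a
BytesRef so the example stays self-contained. One cursor instance is reused
across slices, so no object is allocated per iteration.

```java
// Hypothetical reusable cursor over a byte slice; not part of Lucene.
final class ByteSliceCursor {
    private byte[] bytes;
    private int pos;   // next index to read
    private int upto;  // one past the last valid index (offset + length)

    // Re-point the cursor at a slice without allocating a new object.
    ByteSliceCursor reset(byte[] bytes, int offset, int length) {
        if (offset < 0 || length < 0 || offset > bytes.length - length) {
            throw new IndexOutOfBoundsException("bad slice: " + offset + "+" + length);
        }
        this.bytes = bytes;
        this.pos = offset;
        this.upto = offset + length;
        return this;
    }

    boolean hasNext() {
        return pos < upto;
    }

    byte next() {
        if (pos >= upto) {
            throw new java.util.NoSuchElementException();
        }
        return bytes[pos++];
    }
}

final class CursorDemo {
    // Sums the bytes of one slice using a caller-supplied, reusable cursor.
    static int sum(ByteSliceCursor cursor, byte[] bytes, int offset, int length) {
        cursor.reset(bytes, offset, length);
        int total = 0;
        while (cursor.hasNext()) {
            total += cursor.next();
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5};
        ByteSliceCursor cursor = new ByteSliceCursor();
        System.out.println(sum(cursor, data, 1, 3)); // 2 + 3 + 4
        System.out.println(sum(cursor, data, 0, 2)); // same cursor, new slice
    }
}
```

Keeping pos and upto in the cursor leaves the ref itself with nothing beyond
byte[], offset and length.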




[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow

2013-01-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549611#comment-13549611
 ] 

Robert Muir commented on LUCENE-4675:
-

I don't think we need any additional members in this thing. What more does it
need other than byte[], offset, length?!

I want to remove the extraneous stuff. If you want to make an iterator, you can
separately make your own BytesRefIterator class?




[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow

2013-01-10 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549598#comment-13549598
 ] 

Shai Erera commented on LUCENE-4675:


ok. While you're at it, what do you think about adding an 'upto' member for 
easier iteration on the bytes/ints/chars? (see my comment on LUCENE-4674)




[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow

2013-01-10 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549590#comment-13549590
 ] 

Robert Muir commented on LUCENE-4675:
-

I'm proposing removing these 3 methods from BytesRef itself, that's all.

The guy from the outside knows what he can do: he knows if the bytes actually
point to a slice of a PagedBytes (grow is actually senseless here!), or just a
simple byte[], or whatever. He doesn't need BytesRef itself to do these things.

So he can then change the ref to point at a different slice, or a different
byte[] altogether, or whatever.
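
As an illustration of growing "from the outside", here is a sketch that
assumes the ref's fields stay non-final and publicly assignable. PlainRef and
CallerSideGrow are hypothetical stand-ins, not Lucene's classes. The caller
allocates the larger array, copies the slice, and re-points the ref, so the
ref itself never allocates.

```java
// Stand-in ref with three mutable fields; NOT Lucene's BytesRef.
final class PlainRef {
    byte[] bytes;
    int offset;
    int length;

    PlainRef(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }
}

final class CallerSideGrow {
    // Copies the referenced slice into a fresh array of the given capacity and
    // re-points the ref at it. The original backing array is left untouched.
    static void repointToGrownCopy(PlainRef ref, int newCapacity) {
        if (newCapacity < ref.length) {
            throw new IllegalArgumentException("capacity smaller than current length");
        }
        byte[] grown = new byte[newCapacity];
        System.arraycopy(ref.bytes, ref.offset, grown, 0, ref.length);
        ref.bytes = grown;
        ref.offset = 0;
    }

    public static void main(String[] args) {
        byte[] shared = {9, 1, 2, 9};             // the slice {1, 2} lives inside a larger buffer
        PlainRef ref = new PlainRef(shared, 1, 2);
        repointToGrownCopy(ref, 8);
        ref.bytes[ref.offset + ref.length] = 7;   // append into the private copy
        ref.length++;
        System.out.println(shared[3]);            // still 9: the shared buffer was not touched
    }
}
```

Whether copying is the right move depends on what backs the bytes, which only
the calling code can know.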




[jira] [Commented] (LUCENE-4675) remove *Ref.copy/append/grow

2013-01-10 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549589#comment-13549589
 ] 

Shai Erera commented on LUCENE-4675:


I kinda like grow(). Will I be able to grow() the buffer from the outside if 
you remove it? I.e. will the byte[] not be final?
