On 05/20/2015 08:53 AM, Trejkaz wrote:
On Wed, May 20, 2015 at 3:21 PM, Olivier Binda <olivier.bi...@wanadoo.fr> wrote:
My take:
Indeed, BytesRef is mutable.
This is for performance reasons: to avoid unnecessary object
creation and unnecessary copying, and also to work around
the Java "issue" that, most of the time, you need to pass an array with an
offset and length to methods for performance, but you don't want to create
a new array every time you do that.
In your case, you are supposed to copy your bytes because, indeed, the
BytesRef will change every time you call a Lucene method on it
(it is mutable), and the array it points to will change too, because it
might be an internal array of a reader/buffer/codec
(and you don't know the internal workings of those)...
That's fair enough, most of this is a philosophical issue anyway. Some
people prefer reusing objects and overwriting data because they don't
trust GC or whatever. I prefer immutable objects because at least then
when you have an object you can guarantee nobody else can mess with
it.
But that aside, it's still astonishing when the method to clone an
object doesn't actually clone it. There isn't any other obvious method
on BytesRef to perform a copy, either. What are we supposed to do,
pull out the byte array, offset and length manually and then jam it
into another BytesRef? Ew.
If you want immutable data, you have to create a new byte array and copy
the bytes into it.
Lucene gives the user of the library the choice of how to use the data
(which is good)
instead of creating immutable data for everybody and making people
who don't need it pay the penalty.
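To make the shallow vs. deep copy difference concrete, here is a minimal sketch. ByteSlice is a hypothetical stand-in for a (bytes, offset, length) holder like BytesRef, not the Lucene class itself, and the method names shallowCopy/deepCopy are mine. As far as I know, BytesRef.deepCopyOf(...) does the deep copy for you, but check the javadoc for your version.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a (bytes, offset, length) slice such as BytesRef.
final class ByteSlice {
    final byte[] bytes;
    final int offset;
    final int length;

    ByteSlice(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    // Shallow copy: a new wrapper object that still points at the same array.
    ByteSlice shallowCopy() {
        return new ByteSlice(bytes, offset, length);
    }

    // Deep copy: a new wrapper with a private array holding only the live bytes.
    ByteSlice deepCopy() {
        byte[] copy = Arrays.copyOfRange(bytes, offset, offset + length);
        return new ByteSlice(copy, 0, length);
    }

    String utf8() {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }
}

class ByteSliceDemo {
    public static void main(String[] args) {
        byte[] shared = "foobar".getBytes(StandardCharsets.UTF_8);
        ByteSlice view = new ByteSlice(shared, 0, 3); // covers "foo"
        ByteSlice shallow = view.shallowCopy();
        ByteSlice deep = view.deepCopy();

        shared[0] = 'z'; // simulates the library reusing its internal buffer

        System.out.println(shallow.utf8()); // follows the shared buffer: "zoo"
        System.out.println(deep.utf8());    // keeps the copied bytes: "foo"
    }
}
```

The deep copy costs one allocation and one array copy per value, which is exactly the penalty you only pay when you need to keep the bytes around.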
There are other places in Lucene that are designed with performance as a
goal
and that may not behave as one would assume at first glance:
For example, I realized recently that the binaryDocValues that I was
sharing in a hashmap were not thread-safe (it's written in the docs,
but sometimes, after a while, you use something whose details you have
forgotten), and I had to take measures to use the clone method they have,
so that they are used the way they were designed to be.
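The general pattern I mean (one private copy per thread instead of one shared instance) can be sketched like this. ValueCursor and PerThreadCursor are hypothetical names for illustration only, not Lucene classes, and duplicate() stands in for whatever clone method the real object offers.

```java
// Hypothetical reader with per-instance mutable state, so a single
// instance must not be used by two threads at once.
final class ValueCursor {
    private final String[] values; // shared, never modified
    private int position = -1;     // mutable state, private to this instance

    ValueCursor(String[] values) {
        this.values = values;
    }

    // Advances this cursor and returns the value at the new position.
    String next() {
        position++;
        return values[position];
    }

    // New cursor over the same data, starting again from the beginning.
    ValueCursor duplicate() {
        return new ValueCursor(values);
    }
}

// Hands each thread its own duplicate instead of sharing one instance.
final class PerThreadCursor {
    private final ThreadLocal<ValueCursor> local;

    PerThreadCursor(ValueCursor prototype) {
        // duplicate() only reads the final 'values' field, so calling it
        // from several threads is safe.
        this.local = ThreadLocal.withInitial(prototype::duplicate);
    }

    ValueCursor get() {
        return local.get();
    }
}

class PerThreadCursorDemo {
    public static void main(String[] args) throws InterruptedException {
        PerThreadCursor cursors =
            new PerThreadCursor(new ValueCursor(new String[] {"a", "b"}));
        ValueCursor mine = cursors.get();
        mine.next(); // moves only this thread's cursor

        String[] seenByOther = new String[1];
        Thread t = new Thread(() -> seenByOther[0] = cursors.get().next());
        t.start();
        t.join();

        // The other thread started from the beginning on its own cursor.
        System.out.println(seenByOther[0]);
    }
}
```

Putting the duplicates in a ThreadLocal rather than in a shared map is one choice among several; the point is only that no two threads ever touch the same stateful instance.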
Lucene is a library that you have to understand in order not to shoot
yourself in the foot with it...
For example, persisting docIds is a very bad idea 99.99% of the time
(there are warnings about it).
Paying attention to the docs helps a lot.
TX
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org