On 05/20/2015 08:53 AM, Trejkaz wrote:
On Wed, May 20, 2015 at 3:21 PM, Olivier Binda <olivier.bi...@wanadoo.fr> wrote:
My take:
Indeed, BytesRef is mutable.
This is for performance reasons: to avoid unnecessary object
creation and unnecessary copying, and also to work around
the Java "issue" that, for performance, you usually need to pass an array
with an offset and length to methods, but you don't want to create
a new array every time you do that.

In your case, you are supposed to copy your bytes because, indeed, the
BytesRef will change every time you call a Lucene method on it
(it is mutable), and the array it points to will change too, because it
might be an internal array of a reader/buffer/codec
(and you don't know the internal workings of those)...
That's fair enough, most of this is a philosophical issue anyway. Some
people prefer reusing objects and overwriting data because they don't
trust GC or whatever. I prefer immutable objects because at least then
when you have an object you can guarantee nobody else can mess with
it.

But that aside, it's still astonishing when the method to clone an
object doesn't actually clone it. There isn't any other obvious method
on BytesRef to perform a copy, either. What are we supposed to do,
pull out the byte array, offset and length manually and then jam it
into another BytesRef? Ew.
If you want immutable data, you have to create a new byte array and copy the bytes into it.
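
To make the difference concrete, here is a minimal sketch in plain Java. ByteSlice is a hypothetical stand-in for a (bytes, offset, length) reference, not the real Lucene class, and the method names shallowCopy/deepCopy are made up for illustration. It only shows why a copy that shares the array is affected when a buffer is reused, while a copy made with Arrays.copyOfRange is not. As far as I recall, BytesRef also has a static deepCopyOf(BytesRef) helper that does the same job; please check the Javadoc of your Lucene version.

```java
import java.util.Arrays;

// Hypothetical stand-in for a mutable (bytes, offset, length) reference such as
// Lucene's BytesRef. This is NOT the Lucene class, only an illustration.
final class ByteSlice {
    byte[] bytes;
    int offset;
    int length;

    ByteSlice(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    // Shallow copy: a new wrapper object that still points at the same array.
    ByteSlice shallowCopy() {
        return new ByteSlice(bytes, offset, length);
    }

    // Deep copy: allocates a private array holding only the referenced bytes.
    ByteSlice deepCopy() {
        byte[] own = Arrays.copyOfRange(bytes, offset, offset + length);
        return new ByteSlice(own, 0, own.length);
    }

    // Returns the referenced bytes as a fresh array, for printing or comparing.
    byte[] toByteArray() {
        return Arrays.copyOfRange(bytes, offset, offset + length);
    }
}

class ByteSliceDemo {
    public static void main(String[] args) {
        // Pretend this is an internal buffer owned by some reader or codec.
        byte[] shared = {10, 20, 30, 40, 50};
        ByteSlice view = new ByteSlice(shared, 1, 3); // refers to 20, 30, 40

        ByteSlice shallow = view.shallowCopy();
        ByteSlice deep = view.deepCopy();

        // The owner reuses its buffer for the next value.
        shared[2] = 99;

        System.out.println(Arrays.toString(shallow.toByteArray())); // [20, 99, 40]
        System.out.println(Arrays.toString(deep.toByteArray()));    // [20, 30, 40]
    }
}
```

The deep copy costs one allocation and one array copy per value, which is exactly the penalty you only want to pay when you actually need to keep the bytes around.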

Lucene gives the user of the library the choice of how to use the data (which is good), instead of creating immutable data for everybody and making people who don't need it pay the penalty.


There are other places in Lucene that are designed with performance as a goal
and that may not behave as one would expect at first glance:

For example, I realized recently that the BinaryDocValues instances I was sharing in a HashMap were not thread-safe (it's written in the docs but... sometimes, after a while, you use something whose details you have forgotten), and I had to take measures to use the clone method they have, so that they are used as they were designed to be.


Lucene is a library that you have to understand in order not to shoot yourself in the foot with it. For example, persisting docIds is 99.99% of the time a very bad idea (there are warnings).

Paying attention to the docs helps a lot.


TX

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



