+1

 

In my opinion this is actually a bug. The reason for it reaches back to the 
change from a plain byte[] to BytesRef. Before that change the clone was also 
shallow, but the general use case was to never modify the byte[] contents (i.e. 
to treat them as final) and to assign a new byte[] instead. With the change to 
BytesRef there is one more level of indirection, which leads to confusion. In 
fact, AttributeImpl.clone() should always be a deep clone; otherwise saving 
state and cloning attributes is incorrect.
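
To illustrate the deep-clone behavior argued for above, here is a minimal sketch. It assumes hypothetical stand-in classes (RefSketch for a BytesRef-like view, DeepPayloadHolder for an attribute) and is not Lucene's actual implementation; the local deepCopyOf helper only mirrors the name mentioned in this thread.

```java
import java.util.Arrays;

// Hypothetical stand-in for a BytesRef-like view: (bytes, offset, length).
final class RefSketch {
    byte[] bytes;
    int offset;
    int length;

    RefSketch(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    // Copies only the referenced slice into a fresh array, starting at offset 0.
    static RefSketch deepCopyOf(RefSketch other) {
        byte[] copy = Arrays.copyOfRange(other.bytes, other.offset, other.offset + other.length);
        return new RefSketch(copy, 0, other.length);
    }
}

// Hypothetical attribute-like holder whose clone() is deep.
final class DeepPayloadHolder implements Cloneable {
    RefSketch payload;

    DeepPayloadHolder(RefSketch payload) {
        this.payload = payload;
    }

    @Override
    public DeepPayloadHolder clone() {
        try {
            DeepPayloadHolder copy = (DeepPayloadHolder) super.clone();
            // Object.clone() copied only the reference; replace it with an independent copy.
            if (payload != null) {
                copy.payload = RefSketch.deepCopyOf(payload);
            }
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }

    public static void main(String[] args) {
        RefSketch original = new RefSketch(new byte[] {1, 2, 3, 4}, 1, 2);
        DeepPayloadHolder holder = new DeepPayloadHolder(original);
        DeepPayloadHolder copy = holder.clone();
        original.bytes[1] = 9;                      // mutate the source in place
        System.out.println(copy.payload.bytes[0]); // the clone still holds the old value 2
    }
}
```

With a deep clone, a saved state stays valid even if the source buffer is later reused or modified in place.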

 

We should also fix the documentation for CTA (CharTermAttribute) and PA 
(PayloadAttribute). Lucene 5 is the ideal place to fix this behavior, so we 
don't break code unexpectedly.

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: [email protected]

 

From: Shai Erera [mailto:[email protected]] 
Sent: Saturday, November 08, 2014 5:26 AM
To: [email protected]
Subject: Why is PayloadAttribute.clone() shallow?

 

Hi

Someone ran into this in the context of working with TeeSinkTokenFilter - when 
the state is cloned for each of the sinks, if you have a PayloadAttribute on 
the stream, it's shallow cloned and thus all sinks share the same byte[].
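
A minimal sketch of the sharing problem follows. It assumes hypothetical stand-in classes (ShallowPayload, SharedPayloadDemo) rather than the real PayloadAttribute or TeeSinkTokenFilter, and only mimics the idea of cloning state once per sink.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an attribute holding a payload; not Lucene's class.
final class ShallowPayload implements Cloneable {
    byte[] bytes;

    ShallowPayload(byte[] bytes) {
        this.bytes = bytes;
    }

    // Shallow clone: Object.clone() copies the reference, so the copy shares the byte[].
    @Override
    public ShallowPayload clone() {
        try {
            return (ShallowPayload) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }
}

final class SharedPayloadDemo {
    // Clones the source state once per sink, as a tee-style filter would.
    static List<ShallowPayload> cloneForSinks(ShallowPayload source, int sinkCount) {
        List<ShallowPayload> sinks = new ArrayList<>();
        for (int i = 0; i < sinkCount; i++) {
            sinks.add(source.clone());
        }
        return sinks;
    }

    public static void main(String[] args) {
        ShallowPayload source = new ShallowPayload(new byte[] {1, 2, 3});
        List<ShallowPayload> sinks = cloneForSinks(source, 2);
        sinks.get(0).bytes[0] = 42;                 // one sink modifies its payload in place
        System.out.println(sinks.get(1).bytes[0]); // prints 42: the other sink sees the change
    }
}
```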

At first I thought this is just a bug in PA.clone(), and that it should have 
been implemented using deepCopyOf, but then I noticed that it's actually 
documented that if you require deep cloning, you should implement your own PA 
(or at least override .clone()).

But then, CharTermAttribute.clone() documents that it does a shallow clone, but 
actually implements a *deep* clone.

So now I think PA.clone() should indeed be implemented as a deep clone...

What do you think?

Shai
