+1
This is actually a bug in my opinion. The reason for this reach back to the change from pure byte[] to BytesRef. Before it was also shallow, but the general use-case was to not change the byte[] contents (so see it as final), but assign new byte[] to it – but with the change to BytesRef there is one more indirection, leading to confusion. In fact AttributeImpl.clone() should always be a deep clone, otherwise saving state and cloning attributes is incorrect. And we should fix the documentation for CTA and PA. Lucene 5 is the ideal place to fixup this behavior, so we dont break code unexpected. Uwe ----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen <http://www.thetaphi.de/> http://www.thetaphi.de eMail: [email protected] From: Shai Erera [mailto:[email protected]] Sent: Saturday, November 08, 2014 5:26 AM To: [email protected] Subject: Why is PayloadAttribute.clone() shallow? Hi Someone ran into this in the context of working with TeeSinkTokenFilter - when the state is cloned for each of the sinks, if you have a PayloadAttribute on the stream, it's shallow cloned and thus all sinks share the same byte[]. At first I thought this is just a bug in PA.clone(), and that it should have been implemented using deepCopyOf, but then I noticed that it's actually documented that if you require deep cloning, you should implement your own PA (or at least override .clone()). But then, CharTermAttribute.clone() documents that it does a shallow clone, but actually implements a *deep* clone. So now I think PA.clone() should indeed be implemented as a deep clone... What do you think? Shai
