On 1/7/13 9:59 AM, Douglas Bates wrote:
Is there a difference in the copying behavior of

x@little <- other

and

x@little[] <- other

Not in the direction you were hoping, as far as I can tell.

Nested replacement expressions in R and S are unraveled and done as repeated simple replacements. So either way you end up with, in effect
  x@little <- something

If x has >1 reference, as it tends to, EnsureLocal() will call duplicate().

I think the only difference is that your second form gets you to duplicate the little vector twice. ;-)
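
To make the unraveling concrete, here is a minimal sketch in R. The class name `Model` and the variable `tmp` are invented for the illustration (the evaluator uses its own internal temporary). The sketch only shows that the two spellings produce the same value; it says nothing about how many copies are made along the way.

```r
library(methods)

# Hypothetical class, used only for this illustration.
setClass("Model", representation(little = "numeric", BIG = "numeric"))

x <- new("Model", little = c(1, 2, 3), BIG = numeric(10))
other <- c(10, 20, 30)

# Nested form, as written by the user.
y1 <- x
y1@little[] <- other

# Roughly what it is unraveled into: two simple replacements.
y2 <- x
tmp <- y2@little      # extract the small vector
tmp[] <- other        # simple replacement 1: `[<-` on the vector
y2@little <- tmp      # simple replacement 2: `@<-` on the whole object

stopifnot(identical(y1, y2))
```

Either way the last step is a plain `x@little <- something` on the whole object, which is where the duplication question arises.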

John

I was using the second form in (yet another!) modification of the internal
representation of mixed-effects models in the lme4 package in the hopes
that it would not trigger copying of the entire object.  The object
representing the model is quite large but the changes during iterations are
to small vectors representing parameters and coefficients.



On Thu, Jan 3, 2013 at 1:08 PM, John Chambers <j...@r-project.org> wrote:

Martin Morgan commented in email to me that a change to any slot of an
object that has other, large slot(s) does substantial computation,
presumably from copying the whole object.  Is there anything to be done?

There are in fact two possible changes, one automatic but only partial,
the other requiring some action on the programmer's part.  Herewith the
first; I'll discuss the second in a later email.

Some context:  The notion is that our object has some big data and some
additional smaller things.  We need to change the small things but would
rather not copy the big things all the time.  (With long vectors, this
becomes even more relevant.)

There are three likely scenarios: slots, attributes and named list
components.  Suppose our object has "little" and "BIG" encoded in one of
these.

The three relevant computations are:

x@little <- other
attr(x, "little") <- other
x$little <- other

It turns out that these are all similar in behavior with one important
exception--fixing that is the automatic change.
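
For concreteness, the three encodings might be set up as in the sketch below. The class name `Holder` and the object names `xs`, `xa` and `xl` are assumptions made for the example, not names from any package.

```r
library(methods)

other <- c(0.5, 0.25)
BIG   <- numeric(1e6)   # stands in for the large data

# 1. Slots of an S4 object (hypothetical class "Holder").
setClass("Holder", representation(little = "numeric", BIG = "numeric"))
xs <- new("Holder", little = c(1, 2), BIG = BIG)
xs@little <- other

# 2. An attribute attached to the large vector.
xa <- structure(BIG, little = c(1, 2))
attr(xa, "little") <- other

# 3. Named components of a list.
xl <- list(little = c(1, 2), BIG = BIG)
xl$little <- other
```

In all three cases the user-visible result is the same: only `little` changes in value, whatever copying happens underneath.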

I need to review what R does here. All these are replacement functions,
`@<-`, `attr<-`, `$<-`.  The evaluator checks before calling any
replacement whether the object needs to be duplicated (in a routine
EnsureLocal()).  It does that by examining a special field that holds the
reference status of the object.

Some languages, such as Python (and S), keep reference counts for each
object, de-allocating the object when the reference count drops back to
zero.  R uses a different strategy. Its NAMED() field is 0, 1 or 2
according to whether the object has been assigned never, once or more than
once.  The field is not a reference count and is not decremented--relevant
for this issue.  Objects are de-allocated only when garbage collection
occurs and the object does not appear in any current frame or other context.
(I did not write any of this code, so apologies if I'm misrepresenting it.)

When any of these replacement operations first occurs for a particular
object in a particular function call, it's very likely that the reference
status will be 2 and EnsureLocal will duplicate it--all of it. Regardless
of which of the three forms is used.
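
A small sketch of that first replacement inside a function call, using the list form. The helper `update_little` is hypothetical. `tracemem()` reports duplications only if R was built with memory profiling, and what it prints (and how much is actually copied) depends on the R version, so no output is shown here.

```r
x <- list(little = c(1, 2), BIG = numeric(1e6))

# Hypothetical helper: the argument `obj` is the same object as the caller's
# `x`, so it has more than one reference when the replacement runs.
update_little <- function(obj, value) {
  obj$little <- value    # R must duplicate `obj` before modifying it
  obj
}

# tracemem() is only available if R was built with memory profiling.
if (capabilities("profmem")) tracemem(x)
x2 <- update_little(x, c(3, 4))
if (capabilities("profmem")) untracemem(x)

# The caller's object is untouched, which is what the duplication guarantees.
stopifnot(identical(x$little, c(1, 2)), identical(x2$little, c(3, 4)))
```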

Here the non-level-playing-field aspect comes in.  `@<-` is a normal R
function (a "closure") but the other two are primitives in the main code
for R.  Primitives have no frame in which arguments are stored.  As a
result the new version of x is normally stored with status 1.

If one does a second replacement in the same call (in a loop, e.g.) that
should not normally copy again.  But the result of `@<-` will be an object
from its frame and will have status 2 when saved, forcing a copy each time.
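
The loop case can be tried with a sketch like the following. The class `Fit` and the function `iterate` are made up for the example. Where memory profiling is available, the number of `tracemem` lines printed gives a rough idea of how many duplications happen during the loop; that number depends on the R version, so it is not stated here.

```r
library(methods)

# Hypothetical class with one small and one large slot.
setClass("Fit", representation(little = "numeric", BIG = "numeric"))

iterate <- function(obj, n) {
  # Report duplications of `obj`, if this build of R supports it.
  if (capabilities("profmem")) tracemem(obj)
  for (i in seq_len(n)) {
    obj@little <- obj@little + 1   # repeated slot replacement in one call
  }
  if (capabilities("profmem")) untracemem(obj)
  obj
}

fit  <- new("Fit", little = c(0, 0), BIG = numeric(1e6))
fit2 <- iterate(fit, 5)
```

Ideally only the first replacement in the call copies the object; a copy on every iteration is the behavior described above for a non-primitive `@<-`.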

So the change, naturally, is that R 3.0.0 will have a primitive
implementation of `@<-`.  This has been implemented in r-devel (rev. 61544).
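
One way to check which of the three replacement functions are primitives in a given installation is sketched below. The variable names `fns` and `prims` are arbitrary; the result for `@<-` depends on the version of R being run.

```r
library(methods)

fns <- c("@<-", "attr<-", "$<-")

# TRUE for functions implemented as primitives, FALSE for closures.
prims <- vapply(fns,
                function(f) is.primitive(get(f, mode = "function")),
                logical(1))
print(prims)
```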

Please try it out _before_ we issue that version, especially if you own a
package that does things related to this question.

John

PS:  Some may have noticed that I didn't mention a fourth approach: fields
in a reference class object.  The assumption was that we wanted classical,
functional behavior here.  Reference classes don't have the copy problem
but don't behave functionally either.  But that is in fact the direction
for the other approach.  I'll discuss that later, when the corresponding
code is available.
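
For comparison, a minimal sketch of the reference class behavior mentioned in the PS. The class name `ModelRef` and the object names are assumptions for the example; this is not the approach promised for the later email, only an illustration of non-functional semantics.

```r
library(methods)

# Hypothetical reference class with one small and one large field.
ModelRef <- setRefClass("ModelRef",
                        fields = list(little = "numeric", BIG = "numeric"))

m     <- ModelRef$new(little = c(1, 2), BIG = numeric(1e6))
alias <- m                 # a second name for the SAME object, not a copy

alias$little <- c(3, 4)    # modifies the object in place
stopifnot(identical(m$little, c(3, 4)))   # the change is visible through `m`

m2 <- m$copy()             # an explicit copy is needed for an independent object
m2$little <- c(5, 6)
stopifnot(identical(m$little, c(3, 4)))
```

The field assignment does not copy the object, but `m` and `alias` change together, which is exactly the departure from functional behavior noted above.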

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

