On Nov 24, 2011, at 8:05 AM, Matthew Dowle wrote:

>>
>> On Nov 24, 2011, at 12:34 , Matthew Dowle wrote:
>>
>>>>
>>>> On Nov 24, 2011, at 11:13 , Matthew Dowle wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I expected NAMED to be 1 in all these three cases. It is for one of
>>>>> them, but not the other two?
>>>>>
>>>>>> R --vanilla
>>>>> R version 2.14.0 (2011-10-31)
>>>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>>>>
>>>>>> x = 1L
>>>>>> .Internal(inspect(x)) # why NAM(2)? expected NAM(1)
>>>>> @2514aa0 13 INTSXP g0c1 [NAM(2)] (len=1, tl=0) 1
>>>>>
>>>>>> y = 1:10
>>>>>> .Internal(inspect(y)) # NAM(1) as expected but why different to x?
>>>>> @272f788 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...
>>>>>
>>>>>> z = data.frame()
>>>>>> .Internal(inspect(z)) # why NAM(2)? expected NAM(1)
>>>>> @24fc28c 19 VECSXP g0c0 [OBJ,NAM(2),ATT] (len=0, tl=0)
>>>>> ATTRIB:
>>>>>   @24fc270 02 LISTSXP g0c0 []
>>>>>     TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>>>>     @24fc334 16 STRSXP g0c0 [] (len=0, tl=0)
>>>>>     TAG: @3f2040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
>>>>>     @24fc318 13 INTSXP g0c0 [] (len=0, tl=0)
>>>>>     TAG: @3f2388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
>>>>>     @25be500 16 STRSXP g0c1 [] (len=1, tl=0)
>>>>>       @1d38af0 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"
>>>>>
>>>>> It's a little difficult to search for the word "named" but I tried and
>>>>> found this in R-ints :
>>>>>
>>>>> "Note that optimizing NAMED = 1 is only effective within a primitive
>>>>> (as the closure wrapper of a .Internal will set NAMED = 2 when the
>>>>> promise to the argument is evaluated)"
>>>>>
>>>>> So might it be that just looking at NAMED using .Internal(inspect())
>>>>> is setting NAMED=2? But if so, why does y have NAMED==1?
>>>>
>>>> This is tricky business... I'm not quite sure I'll get it right, but
>>>> let's try
>>>>
>>>> When you are assigning a constant, the value you assign is already part
>>>> of the assignment expression, so if you want to modify it, you must
>>>> duplicate. So NAMED==2 on z <- 1 is basically to prevent you from
>>>> accidentally "changing the value of 1". If it weren't, then you could
>>>> get bitten by code like for(i in 1:2) {z <- 1; if(i==1) z[1] <- 2}.
>>>>
>>>> If you're assigning the result of a computation, then the object only
>>>> exists once, so
>>>> z <- 0+1 gets NAMED==1.
>>>>
>>>> However, if the computation is done by returning a named value from
>>>> within a function, as in
>>>>
>>>>> f <- function(){v <- 1+0; v}
>>>>> z <- f()
>>>>
>>>> then again NAMED==2. This is because the side effects of the function
>>>> _might_ result in something having a hold on the function environment,
>>>> e.g. if we had
>>>>
>>>> e <- NULL
>>>> f <- function(){e <<-environment(); v <- 1+0; v}
>>>> z <- f()
>>>>
>>>> then z[1] <- 5 would change e$v too. As it happens, there aren't any
>>>> side effects in the former case, but R loses track and assumes the
>>>> worst.
>>>>
>>>
>>> Thanks a lot, think I follow. That explains x vs y, but why is z
>>> NAMED==2? The result of data.frame() is an object that exists once
>>> (similar to 1:10) so shouldn't it be NAMED==1 too? Or, R loses track and
>>> assumes the worst even on its own functions such as data.frame()?
>>
>> R loses track. I suspect that is really all it can do without actual
>> reference counting. The function data.frame is more than 150 lines of
>> code, and if any of those end up invoking user code, possibly via a class
>> method, you can't tell definitively whether or not the evaluation
>> environment dies at the return.
>
> Ohhh, think I see now. After Duncan's reply I was going to ask if it was
> possible to change data.frame() to be primitive so it could set NAMED=1.
> But it seems primitive functions can't use R code so data.frame() would
> need to be ported to C. Ok! - not quick or easy, and not without
> considerable risk. And, data.frame() can invoke user code inside it anyway
> then.
>
> Since list() is primitive I tried to construct a data.frame starting with
> list() [since structure() isn't primitive], but then merely adding an
> attribute seems to set NAMED==2 too ?
>

Yes, because

attr(x,y) <- z

is the same as

`*tmp*` <- x
x <- `attr<-`(`*tmp*`, y, z)
rm(`*tmp*`)

so there are two references to the data frame: one in DF and one in `*tmp*`. It is the first line that causes the NAMED bump. And, yes, it's real:

> `f<-`=function(x,value) { print(ls(parent.frame())); x<-value }
> x=1
> f(x)=1
[1] "*tmp*" "f<-" "x"

You could skip that by using the function directly (I don't think it's recommended, though):

> .Internal(inspect(l <- list(a=1)))
@1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
ATTRIB:
  @100b6e748 02 LISTSXP g0c0 []
    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @1028c82c8 16 STRSXP g0c1 [] (len=1, tl=0)
      @1009cd388 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
> .Internal(inspect(`names<-`(l, "b")))
@1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
ATTRIB:
  @100b6e748 02 LISTSXP g0c0 []
    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
      @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
> .Internal(inspect(l))
@1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
  @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
ATTRIB:
  @100b6e748 02 LISTSXP g0c0 []
    TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
    @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
      @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"

Cheers,
Simon

>> DF = list(a=1:3,b=4:6)
>> .Internal(inspect(DF)) # so far so good: NAM(1)
> @25149e0 19 VECSXP g0c1 [NAM(1),ATT] (len=2, tl=0)
>   @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>   @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
> ATTRIB:
>   @2457984 02 LISTSXP g0c0 []
>     TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>     @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>       @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>       @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>>
>> attr(DF,"foo") <- "bar" # just adding an attribute sets NAM(2) ?
>> .Internal(inspect(DF))
> @25149e0 19 VECSXP g0c1 [NAM(2),ATT] (len=2, tl=0)
>   @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>   @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
> ATTRIB:
>   @2457984 02 LISTSXP g0c0 []
>     TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>     @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>       @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>       @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>     TAG: @245732c 01 SYMSXP g0c0 [] "foo"
>     @25148a0 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>       @2514920 09 CHARSXP g0c1 [gp=0x20] "bar"
>
>
> Matthew
>
>
>> --
>> Peter Dalgaard, Professor
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Email: pd....@cbs.dk Priv: pda...@gmail.com
>>
>>
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
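
For anyone following along, here is a minimal sketch of the user-visible behaviour discussed in this thread. It deliberately avoids .Internal(inspect()), since the NAM() values shown above come from R 2.14.0 and may print differently in other R versions. It only checks results that R's copy-on-modify semantics should give regardless of how sharing is tracked internally. The names seen, DF1 and DF2 are made up for illustration and do not appear in the original messages.

```r
# 1. Peter's loop: re-assigning the constant 1 must give 1 again on the
#    second pass, even though z[1] <- 2 modified z on the first pass.
seen <- numeric(0)
for (i in 1:2) {
  z <- 1
  seen <- c(seen, z)        # record z right after the assignment
  if (i == 1) z[1] <- 2
}
print(seen)                 # 1 1

# 2. Peter's environment-capturing function: e keeps the function
#    environment alive, so v is still reachable after f() returns.
#    Modifying z must therefore not be visible through e$v.
e <- NULL
f <- function() { e <<- environment(); v <- 1 + 0; v }
z <- f()
z[1] <- 5
print(e$v)                  # 1

# 3. Simon's expansion of attr(x, y) <- z, written out by hand.
DF1 <- list(a = 1:3, b = 4:6)
attr(DF1, "foo") <- "bar"

DF2 <- list(a = 1:3, b = 4:6)
`*tmp*` <- DF2                            # second reference to the list
DF2 <- `attr<-`(`*tmp*`, "foo", "bar")
rm(`*tmp*`)
print(identical(DF1, DF2))  # TRUE
```

Cases 1 and 2 are the situations where R has to assume the value may be shared and duplicate before modifying; case 3 shows that the hand-written expansion and the replacement-call syntax produce the same object.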