Oops, forgot to add the R code.
On 10/13/2009 03:42 PM, Romain Francois wrote:
Hello, Here is a more complete patch implementing an iterating scheme inspired from the java Iterable/Iterator design. Nothing is changed for non S4 objects. The patch contains 4 generic functions : is.iterable : indicates if an object is iterable. The default method returns FALSE iterator: returns the iterator associated with an iterable object. the idea is that this function is only called if the object is iterable. hasNext: indicates if an iterator has more elements. The default method returns FALSE getNext: returns the next element of the iterator Here is an example: require( methods ) setClass( "SimpleIterable", representation( to="integer" ) ) setClass("SimpleIterator", representation(to="integer", env = "environment") ) setMethod("is.iterable", "SimpleIterable", function(x) TRUE ) setMethod("iterator", "SimpleIterable", function(x){ env <- new.env() assign( "i" , 0L, envir = env ) new( "SimpleIterator", to = x...@to, env = env ) } ) setMethod( "hasNext", "SimpleIterator", function(x){ res <- x...@env[["i"]] < x...@to res } ) setMethod( "getNext", "SimpleIterator", function(x){ new.i <- x...@env[["i"]] + 1L assign( "i", new.i, envir = x...@env ) new.i } ) > o <- new( "SimpleIterable", to = 10L ) > for( i in o ){ + if( i == 3L ) next + if( i == 5L ) break + cat( i, "\n" ) + } 1 2 4 Here is an example iterating over a java Iterable object (the methods would need a bit more error trapping) without fetching all the elements in advance: require( rJava ) .jinit() setMethod( "is.iterable", "jobjRef", function(x){ .jinherits( x, "java/lang/Iterable" ) } ) setMethod( "iterator", "jobjRef", function(x){ .jcall( x, "Ljava/util/Iterator;", "iterator" ) } ) setMethod( "hasNext", "jobjRef", function(x){ .jcall( x, "Z", "hasNext" ) } ) setMethod( "getNext", "jobjRef", function(x){ .jcall( x, "Ljava/lang/Object;", "next" ) } ) > v <- new( J("java/util/Vector" ) ) > v$add( new( J("java/lang/Double" ), 10.2 ) ) [1] TRUE > v$add( new( J("java/awt/Point"), 10L, 10L ) ) [1] TRUE > for( i in v){ + print( i$getClass()$getName() ) + } [1] "java.lang.Double" [1] "java.awt.Point" While I'm on this, in the usual for loop : - why is the switch inside the for. The code would be slighly more efficient if it was the other way ? - why so many calls to TYPEOF, is it not always going to return the same type ? - why recreating v each time ? (I probably miss something here) Romain On 10/13/2009 11:09 AM, Romain Francois wrote:Hello, Consider this :setClass("track", representation(x="numeric", y="numeric"))[1] "track"o <- new( "track", x = 1, y = 2 ) for( i in o ){+ cat( "hello\n") + } Error: invalid type/length (S4/1) in vector allocation This happens at those lines of do_for: n = LENGTH(val); PROTECT_WITH_INDEX(v = allocVector(TYPEOF(val), 1), &vpi); because allocVector( S4SXP, 1) does not make sense. What about detecting S4SXP and attempt to call as.list, similarly to what lapply does ?as.list.track <- function(x, ...){ list( x = x...@x, y = x...@y ) } lapply( o, identity )$x [1] 1 $y [1] 2 That would make for loops more generic, and make it possible to implement custom "iterators". I'm attaching a patch to eval.c that does just that. For example : > setClass("iterator", representation(to="integer")) [1] "iterator" > o <- new( "iterator", to = 4L ) > setGeneric( "as.list" ) [1] "as.list" > setMethod( "as.list", "iterator", function(x, ...) { + seq_len( x...@to ) + }) [1] "as.list" > > for( i in o ){ + cat( i, "\n" ) + } 1 2 3 4 Obviously that is the cheap way of doing it, something better would be to abstract a bit more what is an "iterator". For example in java iterators implement just two methods : hasNext() that indicates if there is a next object and next() that returns the next object. For the long story, one motivation for this is actually to deal with java iterators (with the devel version of rJava, and this patch), you might do something like this: > v <- new( J("java/util/Vector") ) > v$add( new( J("java/awt/Point"), 10L, 10L ) ) [1] TRUE > v$add( new( J("java/lang/Double"), 10 ) ) [1] TRUE > for( item in v ){ + print( item$getClass()$getName() ) + } [1] "java.awt.Point" [1] "java.lang.Double" Where the as.list method for java object references returns a list that is filled by iterating over the object if it implements the Iterable interface. The drawback here is that one has to first fully retrieve the list, by iterating in java, and then process it in R, by iterating again in R. Romain
-- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://tr.im/BcPw : celebrating R commit #50000 |- http://tr.im/ztCu : RGG #158:161: examples of package IDPmisc `- http://tr.im/yw8E : New R package : sos
Index: src/library/base/R/iterators.R =================================================================== --- src/library/base/R/iterators.R (revision 0) +++ src/library/base/R/iterators.R (revision 0) @@ -0,0 +1,36 @@ +# File src/library/base/R/iterators.R +# Part of the R package, http://www.R-project.org +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# A copy of the GNU General Public License is available at +# http://www.r-project.org/Licenses/ + +is.iterable <- function( x ){ + UseMethod("is.iterable") +} +is.iterable.default <- function(x) FALSE + +iterator <- function( x ){ + UseMethod( "iterator" ) +} +iterator.default <- function( x ) NULL + +hasNext <- function( x ){ + UseMethod("hasNext") +} +hasNext.default <- function(x) FALSE + +getNext <- function( x ){ + UseMethod( "getNext" ) +} +getNext.default <- function( x ) NULL + Index: src/main/eval.c =================================================================== --- src/main/eval.c (revision 50053) +++ src/main/eval.c (working copy) @@ -1025,15 +1025,67 @@ do_browser(call, op, R_NilValue, rho); \ } } while (0) +Rboolean isIterable( SEXP object, SEXP rho){ + SEXP expr ; + SEXP ans ; + Rboolean res ; + /* this probably should use DispatchOrEval */ + PROTECT( expr = lang2(install("is.iterable"), object )); + PROTECT( ans = eval( expr, rho ) ) ; + res = asLogicalNoNA(ans, expr) ; + UNPROTECT(2) ; + return res ; +} +Rboolean hasNext( SEXP object, SEXP rho){ + SEXP expr ; + SEXP ans ; + Rboolean res ; + /* this probably should use DispatchOrEval */ + PROTECT( expr = lang2(install("hasNext"), object )); + PROTECT( ans = eval( expr, rho ) ) ; + res = asLogicalNoNA(ans, expr) ; + UNPROTECT(2) ; + return res ; +} + +SEXP getNext( SEXP object, SEXP rho){ + SEXP expr, ans ; + /* this probably should use DispatchOrEval */ + PROTECT( expr = lang2(install("getNext"), object )); + PROTECT( ans =eval( expr, rho ) ) ; + UNPROTECT( 2) ; + return ans ; +} + +SEXP iterator(SEXP object, SEXP rho){ + SEXP expr ; + SEXP ans ; + /* this probably should use DispatchOrEval */ + PROTECT( expr = lang2(install("iterator"), object )); + PROTECT( ans = eval( expr, rho ) ) ; + UNPROTECT(2) ; + return ans ; +} + +SEXP asList( SEXP object, SEXP rho ){ + SEXP expr, ans ; + /* this probably should use DispatchOrEval */ + PROTECT( expr = lang2(install("as.list"), object )); + PROTECT( ans =eval( expr, rho ) ) ; + UNPROTECT( 2) ; + return ans ; +} + + SEXP attribute_hidden do_for(SEXP call, SEXP op, SEXP args, SEXP rho) { int dbg, nm; volatile int i, n, bgn; SEXP sym, body; - volatile SEXP ans, v, val; + volatile SEXP ans, v, val ; RCNTXT cntxt; - PROTECT_INDEX vpi, api; + PROTECT_INDEX vpi, api ; sym = CAR(args); val = CADR(args); @@ -1046,24 +1098,52 @@ PROTECT(val = eval(val, rho)); defineVar(sym, R_NilValue, rho); - /* deal with the case where we are iterating over a factor - we need to coerce to character - then iterate */ - - if( inherits(val, "factor") ) { - PROTECT(ans = asCharacterFactor(val)); - val = ans; - UNPROTECT(2); /* ans and val from above */ - PROTECT(val); + Rboolean iterate = FALSE ; + + /* deal with the S4 case, try to dispatch to as.list */ + if( TYPEOF(val) == S4SXP ){ + SEXP expr ; + + if( isIterable(val, rho) ){ + /* iterating using the iterator scheme */ + iterate = TRUE ; + + /* replace val by its iterator */ + PROTECT(ans = iterator( val, rho ) ) ; + val = ans; + UNPROTECT(2) ; /* ans and val */ + PROTECT(val ); + + PROTECT_WITH_INDEX(v = R_NilValue, &vpi); + } else{ + /* trying as.list. Maybe this should just throw an error */ + PROTECT(ans = asList( val, rho ) ) ; + val = ans ; + UNPROTECT(2); /* ans and val from above*/ + PROTECT(val); + } } - - if (isList(val) || isNull(val)) { - n = length(val); - PROTECT_WITH_INDEX(v = R_NilValue, &vpi); - } - else { - n = LENGTH(val); - PROTECT_WITH_INDEX(v = allocVector(TYPEOF(val), 1), &vpi); - } + + if( !iterate ){ + /* deal with the case where we are iterating over a factor + we need to coerce to character - then iterate */ + + if( inherits(val, "factor") ) { + PROTECT(ans = asCharacterFactor(val)); + val = ans; + UNPROTECT(2); /* ans and val from above */ + PROTECT(val); + } + + if (isList(val) || isNull(val)) { + n = length(val); + PROTECT_WITH_INDEX(v = R_NilValue, &vpi); + } + else { + n = LENGTH(val); + PROTECT_WITH_INDEX(v = allocVector(TYPEOF(val), 1), &vpi); + } + } ans = R_NilValue; dbg = RDEBUG(rho); @@ -1079,62 +1159,84 @@ PROTECT_WITH_INDEX(ans, &api); /**** ans should no longer be needed. LT */ begincontext(&cntxt, CTXT_LOOP, R_NilValue, rho, R_BaseEnv, R_NilValue, R_NilValue); - switch (SETJMP(cntxt.cjmpbuf)) { - case CTXT_BREAK: goto for_break; - case CTXT_NEXT: goto for_next; - } - for (i = 0; i < n; i++) { - DO_LOOP_RDEBUG(call, op, args, rho, bgn); - switch (TYPEOF(val)) { - case LGLSXP: - REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); - LOGICAL(v)[0] = LOGICAL(val)[i]; - setVar(sym, v, rho); - break; - case INTSXP: - REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); - INTEGER(v)[0] = INTEGER(val)[i]; - setVar(sym, v, rho); - break; - case REALSXP: - REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); - REAL(v)[0] = REAL(val)[i]; - setVar(sym, v, rho); - break; - case CPLXSXP: - REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); - COMPLEX(v)[0] = COMPLEX(val)[i]; - setVar(sym, v, rho); - break; - case STRSXP: - REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); - SET_STRING_ELT(v, 0, STRING_ELT(val, i)); - setVar(sym, v, rho); - break; - case RAWSXP: - REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); - RAW(v)[0] = RAW(val)[i]; - setVar(sym, v, rho); - break; - case EXPRSXP: - case VECSXP: - /* make sure loop variable is a copy if needed */ - if(nm > 0) SET_NAMED(VECTOR_ELT(val, i), 2); - setVar(sym, VECTOR_ELT(val, i), rho); - break; - case LISTSXP: - /* make sure loop variable is a copy if needed */ - if(nm > 0) SET_NAMED(CAR(val), 2); - setVar(sym, CAR(val), rho); - val = CDR(val); - break; - default: - errorcall(call, _("invalid for() loop sequence")); - } - REPROTECT(ans = eval(body, rho), api); - for_next: - ; /* needed for strict ISO C compliance, according to gcc 2.95.2 */ - } + if( iterate ){ + switch (SETJMP(cntxt.cjmpbuf)) { + case CTXT_BREAK: goto for_break; + case CTXT_NEXT: goto iterate_next; + } + + while( hasNext( val , rho ) == TRUE ){ + DO_LOOP_RDEBUG(call, op, args, rho, bgn); + + /* get the next item and set it to the loop symbol */ + REPROTECT(v = getNext(val, rho) , vpi); + setVar(sym, v, rho); + + /* eval the loop body */ + REPROTECT(ans = eval(body, rho), api); + + iterate_next: + ; /* needed for strict ISO C compliance, according to gcc 2.95.2 */ + } + } else{ + + switch (SETJMP(cntxt.cjmpbuf)) { + case CTXT_BREAK: goto for_break; + case CTXT_NEXT: goto for_next; + } + for (i = 0; i < n; i++) { + DO_LOOP_RDEBUG(call, op, args, rho, bgn); + switch (TYPEOF(val)) { + case LGLSXP: + REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); + LOGICAL(v)[0] = LOGICAL(val)[i]; + setVar(sym, v, rho); + break; + case INTSXP: + REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); + INTEGER(v)[0] = INTEGER(val)[i]; + setVar(sym, v, rho); + break; + case REALSXP: + REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); + REAL(v)[0] = REAL(val)[i]; + setVar(sym, v, rho); + break; + case CPLXSXP: + REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); + COMPLEX(v)[0] = COMPLEX(val)[i]; + setVar(sym, v, rho); + break; + case STRSXP: + REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); + SET_STRING_ELT(v, 0, STRING_ELT(val, i)); + setVar(sym, v, rho); + break; + case RAWSXP: + REPROTECT(v = allocVector(TYPEOF(val), 1), vpi); + RAW(v)[0] = RAW(val)[i]; + setVar(sym, v, rho); + break; + case EXPRSXP: + case VECSXP: + /* make sure loop variable is a copy if needed */ + if(nm > 0) SET_NAMED(VECTOR_ELT(val, i), 2); + setVar(sym, VECTOR_ELT(val, i), rho); + break; + case LISTSXP: + /* make sure loop variable is a copy if needed */ + if(nm > 0) SET_NAMED(CAR(val), 2); + setVar(sym, CAR(val), rho); + val = CDR(val); + break; + default: + errorcall(call, _("invalid for() loop sequence")); + } + REPROTECT(ans = eval(body, rho), api); + for_next: + ; /* needed for strict ISO C compliance, according to gcc 2.95.2 */ + } + } for_break: endcontext(&cntxt); UNPROTECT(5);
______________________________________________ [email protected] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
