Re: [R] The end of Matlab
Duncan Murdoch wrote:
> On 11/12/2008 9:45 PM, Mike Rowe wrote:
>> Greetings!
>>
>> I come to R by way of Matlab. One feature in Matlab I miss is its end keyword. When you put end inside an indexing expression, it is interpreted as the length of the variable along the dimension being indexed. For example, if the same feature were implemented in R:
>>
>> my.vector[5:end]
>>
>> would be equivalent to:
>>
>> my.vector[5:length(my.vector)]
>
> And if my.vector is of length less than 5?
>
>> or:
>>
>> this.matrix[3:end,end]
>>
>> would be equivalent to:
>>
>> this.matrix[3:nrow(this.matrix),ncol(this.matrix)]
>> # or
>> this.matrix[3:dim(this.matrix)[1],dim(this.matrix)[2]]
>>
>> As you can see, the R version requires more typing, and I am a lousy typist.
>
> It doesn't save typing, but a more readable version would be
>
> rows <- nrow(this.matrix)
> cols <- ncol(this.matrix)
> this.matrix[3:rows, cols]

and if nrow(this.matrix) is less than 3?

>> With this in mind, I wanted to try to implement something like this in R. It seems like that in order to be able to do this, I would have to be able to access the parse tree of the expression currently being evaluated by the interpreter from within my End function-- is this possible? Since the [ and [[ operators are primitive I can't see their arguments via the call stack functions...
>>
>> Anyone got a workaround? Would anybody else like to see this feature added to R?
>
> I like the general rule that subexpressions have values that can be evaluated independent of context, so I don't think this is a good idea.

but this 'general rule' is not really adhered to in r! one example already discussed here at length is subset:

subset(data.frame(...), select=a)

what will be selected? column named a, or columns named by the components of the vector a? this is an example of how you can't say what an expression means in a context-independent manner. and this is an ubiquitous problem in r.
vQ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The end of Matlab
On Fri, 12 Dec 2008, Duncan Murdoch wrote:
> On 11/12/2008 9:45 PM, Mike Rowe wrote:
>> Greetings!
>>
>> I come to R by way of Matlab. One feature in Matlab I miss is its end keyword. When you put end inside an indexing expression, it is interpreted as the length of the variable along the dimension being indexed. For example, if the same feature were implemented in R:
>>
>> my.vector[5:end]
>>
>> would be equivalent to:
>>
>> my.vector[5:length(my.vector)]
>
> And if my.vector is of length less than 5?

Also consider

my.vector[-(1:4)]

>> or:
>>
>> this.matrix[3:end,end]
>>
>> would be equivalent to:
>>
>> this.matrix[3:nrow(this.matrix),ncol(this.matrix)]
>> # or
>> this.matrix[3:dim(this.matrix)[1],dim(this.matrix)[2]]
>>
>> As you can see, the R version requires more typing, and I am a lousy typist.
>
> It doesn't save typing, but a more readable version would be
>
> rows <- nrow(this.matrix)
> cols <- ncol(this.matrix)
> this.matrix[3:rows, cols]

I would have used

this.matrix[-(1:2), ncol(this.matrix)]

which I find much clearer as to its intentions.

>> With this in mind, I wanted to try to implement something like this in R. It seems like that in order to be able to do this, I would have to be able to access the parse tree of the expression currently being evaluated by the interpreter from within my End function-- is this possible? Since the [ and [[ operators are primitive I can't see their arguments via the call stack functions...
>>
>> Anyone got a workaround? Would anybody else like to see this feature added to R?

Learning to use the power of R's indexing and functions like head() and tail() (which are just syntactic sugar) will probably lead you not to miss this.

> I like the general rule that subexpressions have values that can be evaluated independent of context, so I don't think this is a good idea.

Also, '[' is generic, so it would need to be done in such a way that it applied to all methods. As arguments other than the first are passed unevaluated to the methods, I don't think this is really possible (you don't even know if the third argument to `[` is a dimension for a method).

Also, this would effectively make 'end' a reserved word, or 3:end is ambiguous (or at best context-dependent).

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
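The short-vector question and the negative-index alternative raised above can be checked directly. This is only an illustrative sketch; the vector `x` and matrix `m` are made-up example data, not taken from the thread.

```r
# Hypothetical example data: a vector shorter than 5 elements.
x <- c(10, 20, 30)

# 5:length(x) is 5:3, which counts DOWN, so the "from the 5th on" idiom
# silently returns out-of-range NAs plus the last element.
from_fifth <- x[5:length(x)]   # NA NA 30

# Exclusion by negative index and tail() both give an empty result instead.
excl <- x[-(1:4)]              # numeric(0)
tl   <- tail(x, -4)            # numeric(0)

# Matrix case: rows 3..end of the last column, written two ways.
m <- matrix(1:12, nrow = 3, ncol = 4)
a <- m[3:nrow(m), ncol(m)]
b <- m[-(1:2), ncol(m)]
```

The two matrix forms agree here because `m` has at least three rows; with fewer rows only the negative-index form degrades to an empty result.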
Re: [R] The end of Matlab
On 12/12/2008 3:41 AM, Wacek Kusnierczyk wrote:
> Duncan Murdoch wrote:
>> On 11/12/2008 9:45 PM, Mike Rowe wrote:
>>> Greetings!
>>>
>>> I come to R by way of Matlab. One feature in Matlab I miss is its end keyword. When you put end inside an indexing expression, it is interpreted as the length of the variable along the dimension being indexed. For example, if the same feature were implemented in R:
>>>
>>> my.vector[5:end]
>>>
>>> would be equivalent to:
>>>
>>> my.vector[5:length(my.vector)]
>>
>> And if my.vector is of length less than 5?
>>
>>> or:
>>>
>>> this.matrix[3:end,end]
>>>
>>> would be equivalent to:
>>>
>>> this.matrix[3:nrow(this.matrix),ncol(this.matrix)]
>>> # or
>>> this.matrix[3:dim(this.matrix)[1],dim(this.matrix)[2]]
>>>
>>> As you can see, the R version requires more typing, and I am a lousy typist.
>>
>> It doesn't save typing, but a more readable version would be
>>
>> rows <- nrow(this.matrix)
>> cols <- ncol(this.matrix)
>> this.matrix[3:rows, cols]
>
> and if nrow(this.matrix) is less than 3?
>
>>> With this in mind, I wanted to try to implement something like this in R. It seems like that in order to be able to do this, I would have to be able to access the parse tree of the expression currently being evaluated by the interpreter from within my End function-- is this possible? Since the [ and [[ operators are primitive I can't see their arguments via the call stack functions...
>>>
>>> Anyone got a workaround? Would anybody else like to see this feature added to R?
>>
>> I like the general rule that subexpressions have values that can be evaluated independent of context, so I don't think this is a good idea.
>
> but this 'general rule' is not really adhered to in r! one example already discussed here at length is subset:
>
> subset(data.frame(...), select=a)
>
> what will be selected? column named a, or columns named by the components of the vector a? this is an example of how you can't say what an expression means in a context-independent manner.

From which you might conclude that I don't like the design of subset, and you'd be right. However, I don't think this is a counterexample to my general rule. In the subset function, the select argument is treated as an unevaluated expression, and then there are rules about what to do with it. (I.e. try to look up name `a` in the data frame, if that fails, ...)

For the requested behaviour to similarly fall within the general rule, we'd have to treat all indices to all kinds of things (vectors, matrices, dataframes, etc.) as unevaluated expressions, with special handling for the particular symbol `end`.

But Mike wanted an End function, so presumably he wanted the old behaviour of indexing, but to have a function whose value depended on where it was called from. We do have those (e.g. the functions for examining the stack that Mike wanted to make use of), and they're needed for debugging and a few special cases, but as a general rule they should be avoided.

> and this is an ubiquitous problem in r.

I don't think so.

Duncan Murdoch
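The lookup rule described for subset's select argument (data frame first, then the calling environment) can be seen in a small experiment. This is a sketch only; the data frame `df` and the variables `a` and `cols` are invented for the illustration.

```r
# Hypothetical data frame with a column that shares its name with a variable.
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# A character vector named 'a' also exists in the calling environment.
a <- c("b", "c")

# 'select' is captured unevaluated; the name 'a' is found among the columns
# of df first, so the column wins over the vector of the same name.
picked_by_name <- names(subset(df, select = a))

# A name that is NOT a column falls through to the calling environment,
# so its value is used as the set of columns to keep.
cols <- c("b", "c")
picked_by_value <- names(subset(df, select = cols))
```

So the same syntactic form, `select = <name>`, selects one column in the first call and two in the second, depending on what the data frame happens to contain.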
Re: [R] The end of Matlab
Dear list,

> Learning to use the power of R's indexing and functions like head() and tail() (which are just syntactic sugar) will probably lead you not to miss this.

However, how do I exclude the last columns of a data.frame or matrix (or, in general, head and tail for given dimensions of an array)? I.e. something nicer than

t (head (t (x), -n))

for excluding the last n columns of matrix x.

THX, Claudia

--
Claudia Beleites
Dipartimento dei Materiali e delle Risorse Naturali
Università degli Studi di Trieste
Via Alfonso Valerio 6/a
I-34127 Trieste
phone: +39 (0 40) 5 58-34 47
email: cbelei...@units.it
Re: [R] The end of Matlab
How about:

x[, -seq(to=ncol(x), length=n)]

Patrick Burns
patr...@burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Claudia Beleites wrote:
> Dear list,
>
>> Learning to use the power of R's indexing and functions like head() and tail() (which are just syntactic sugar) will probably lead you not to miss this.
>
> However, how do I exclude the last columns of a data.frame or matrix (or, in general, head and tail for given dimensions of an array)? I.e. something nicer than
>
> t (head (t (x), -n))
>
> for excluding the last n columns of matrix x
>
> THX, Claudia
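For comparison, the three ways of dropping the last n columns mentioned so far can be put side by side. This is an illustrative sketch; the matrix `x`, the value of `n`, and the use of `drop = FALSE` and `seq_len()` are additions made for the example.

```r
# Hypothetical 3 x 4 matrix and number of trailing columns to drop.
x <- matrix(1:12, nrow = 3)
n <- 2

# Negative-index form suggested above (length.out spelled out in full).
by_exclusion <- x[, -seq(to = ncol(x), length.out = n), drop = FALSE]

# Positive-index form; seq_len() stays well-behaved when n == ncol(x),
# where 1:(ncol(x) - n) would count down from 1 to 0.
by_selection <- x[, seq_len(ncol(x) - n), drop = FALSE]

# The transpose/head/transpose form from the original question.
by_head <- t(head(t(x), -n))
```

One caveat worth keeping in mind: with n = 0 the negative-index form selects no columns at all, because negating an empty index vector still yields an empty index vector.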
Re: [R] The end of Matlab
Duncan Murdoch wrote: On 12/12/2008 3:41 AM, Wacek Kusnierczyk wrote: but this 'general rule' is not really adhered to in r! one example already discussed here at length is subset: subset(data.frame(...), select=a) what will be selected? column named a, or columns named by the components of the vector a? this is an example of how you can't say what an expression means in a context-independent manner. From which you might conclude that I don't like the design of subset, and you'd be right. However, I don't think this is a counterexample to my general rule. In the subset function, the select argument is treated as an unevaluated expression, and then there are rules about what to do with it. (I.e. try to look up name `a` in the data frame, if that fails, ...) For the requested behaviour to similarly fall within the general rule, we'd have to treat all indices to all kinds of things (vectors, matrices, dataframes, etc.) as unevaluated expressions, with special handling for the particular symbol `end`. But Mike wanted an End function, so presumably he wanted the old behaviour of indexing, but to have a function whose value depended on where it was called from. We do have those (e.g. the functions for examining the stack that Mike wanted to make use of), and they're needed for debugging and a few special cases, but as a general rule they should be avoided. and this is an ubiquitous problem in r. I don't think so. i'd think that neither the 'evaluate' nor the 'deparse' approaches to establishing the values of arguments are particularly rare in r. vQ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The end of Matlab
Duncan Murdoch wrote:
> On 11/12/2008 9:45 PM, Mike Rowe wrote:
>> this.matrix[3:end,end]
>>
>> would be equivalent to:
>>
>> this.matrix[3:nrow(this.matrix),ncol(this.matrix)]
>> # or
>> this.matrix[3:dim(this.matrix)[1],dim(this.matrix)[2]]
>>
>> As you can see, the R version requires more typing, and I am a lousy typist.
>
> It doesn't save typing, but a more readable version would be
>
> rows <- nrow(this.matrix)
> cols <- ncol(this.matrix)
> this.matrix[3:rows, cols]
>
>> With this in mind, I wanted to try to implement something like this in R. It seems like that in order to be able to do this, I would have to be able to access the parse tree of the expression currently being evaluated by the interpreter from within my End function-- is this possible? Since the [ and [[ operators are primitive I can't see their arguments via the call stack functions...
>>
>> Anyone got a workaround? Would anybody else like to see this feature added to R?
>
> I like the general rule that subexpressions have values that can be evaluated independent of context, so I don't think this is a good idea.

if 'end' poses a problem to the general rule of context-free establishment of the values of expressions, the python way might be another option:

x[3:]

instead of

x[3:length(x)]
x[3:end]

(modulo 0-based indexing in python)

could this be considered? laziness seems to be considered a virtue here, and r is stuffed with 'features' designed by lazy programmers to avoid, e.g., typing quotes; why would having to type 'length(...)' or 'nrow(...)' etc. not be considered annoying?

vQ
Re: [R] The end of Matlab
From which you might conclude that I don't like the design of subset, and you'd be right. However, I don't think this is a counterexample to my general rule. In the subset function, the select argument is treated as an unevaluated expression, and then there are rules about what to do with it. (I.e. try to look up name `a` in the data frame, if that fails, ...) For the requested behaviour to similarly fall within the general rule, we'd have to treat all indices to all kinds of things (vectors, matrices, dataframes, etc.) as unevaluated expressions, with special handling for the particular symbol `end`. Except you wouldn't have to necessarily change indexing - you could change seq instead. Then 5:end could produce some kind of special data structure (maybe an iterator) that was recognised by the various indexing functions. This would still be a lot of work for not a lot of payoff, but it would be a logically consistent way of adding this behaviour to indexing, and the basic work would make it possible to develop other sorts of indexing, eg df[evens(), ], or df[last(5), last(3)]. Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The end of Matlab
On Friday 12 December 2008 13:10:20, Patrick Burns wrote:
> How about:
>
> x[, -seq(to=ncol(x), length=n)]

Doing it is not my problem. I just agree with Mike in that I would like to be able to write something shorter than:

x[, 1 : (ncol(x) - n)]

which I btw prefer to your solution.

Also, I don't have a problem writing generalized versions of head and tail to work along other/more dimensions. Or a combined function, taking first and last arguments. Still, they would not be as convenient to use as matlab's:

3 : end - 4

which btw. also does not need parentheses.

I guess the general problem is that there is only one thing with integers that can easily be (ab)used as a flag: the negative sign. But there are (at least) 2 possibly useful special ways of indexing:
- exclusion (as in R)
- using -n for end - n (as in perl)

Now we enjoy having a shortcut for exclusion (at least I do), but still feel that marking "from the end" would be useful. As no other signs (in the sense of flag) are available for integers, we won't be able to stop typing somewhat more in R.

Wacek:
> x[3:]
> instead of
> x[3:length(x)]
> x[3:end]

I don't think that would help: what to use for end - 3 within the convention that negative values mean exclusion?

--- now I start dreaming ---

However, it is possible to define new binary operators (operators are great for lazy typing...). Let's say %:% should be a new operator to generate proper indexing sequences to be used inside [ : e.g.

an.array [ 1:3, -2 %:% -5, ...]

If we can find an.array (which is x inside [, and also inside [[) - which is possible but maybe a bit fiddly - and if we can also find out which of the indices is actually being evaluated (which I don't know how to do), then we could use something* as a flag for "from the end" and calculate the proper sequence.

something* could e.g. be either an attribute to the operators (convenient if we can define an unary operator that allows setting it, e.g. § 3 [§ is the easy-to-type sign on my keyboard that is not yet used...]) or i (the imaginary one) if there is no other convenient unary operator, e.g. 3i

=> easy part of the solution:

make.index <- function (x, along.dim = 1, from, to){
  if (is.null (dim (x)))
    dim <- length (x)
  else
    dim <- dim (x)[along.dim]

  if (is.complex (from)){
    from <- dim - Im (from) # 0i means end; Im () extracts the offset from the end
    ## warning if Re (from) != 0 ?
  }

  if (is.complex (to)){
    to <- dim - Im (to) # 0i means end
    ## warning if Re (to) != 0 ?
  }

  from : to
}

`%:%` <- function (e1, e2) ## using a new operator does not mess up :
  make.index (x = find.x (), along.dim = find.dim (), e1, e2)

now, the heavy part are the still missing find.x () and find.dim () functions...

I'm not sure whether this would be worth the work, but maybe someone is around who just knows how to do this.

Claudia

--
Claudia Beleites
Dipartimento dei Materiali e delle Risorse Naturali
Università degli Studi di Trieste
Via Alfonso Valerio 6/a
I-34127 Trieste
phone: +39 (0 40) 5 58-34 47
email: cbelei...@units.it
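Since find.x () and find.dim () are the missing pieces, the imaginary-number flag can at least be tried out when the object and the dimension are passed in explicitly. What follows is a minimal sketch under that assumption; `make_index()` and its argument names are made up for this illustration and are not the proposed `%:%` operator.

```r
# Sketch: a purely imaginary bound k*1i is read as "k positions before the
# end" along the chosen dimension, so 0i means the end itself.
# 'x' and 'along' must be supplied by the caller (no find.x()/find.dim()).
make_index <- function(x, along = 1, from, to) {
  n <- if (is.null(dim(x))) length(x) else dim(x)[along]
  # Im() pulls out the offset; ordinary numeric bounds pass through unchanged.
  resolve <- function(i) if (is.complex(i)) n - Im(i) else i
  seq(resolve(from), resolve(to))
}

v <- 11:20
idx_to_end   <- make_index(v, from = 3, to = 0i)   # 3 .. end
idx_before   <- make_index(v, from = 3, to = 4i)   # 3 .. end - 4

m <- matrix(1:12, nrow = 3, ncol = 4)
idx_cols     <- make_index(m, along = 2, from = 2, to = 1i)  # columns 2 .. end - 1
```

The result is an ordinary index vector, so it can be used as `v[make_index(v, from = 3, to = 4i)]`; the price is that the object has to be named twice, which is exactly the typing the thread is trying to avoid.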
Re: [R] The end of Matlab
Claudia Beleites wrote: Am Freitag 12 Dezember 2008 13:10:20 schrieb Patrick Burns: How about: x[, -seq(to=ncol(x), length=n)] Doing it is not my problem. I just agree with Mike in that I would like if I could do shorter than: x[, 1 : (ncol(x) - n)] which I btw prefer to your solution. Also, I don't have a problem writing generalized versions of head and tail to work along other/more dimensions. Or a combined function, taking first and last arguments. Still, they would not be as convenient to use as matlab's: 3 : end - 4 which btw. also does not need parentheses. I guess the general problem is that there is only one thing with integers that can easily be (ab)used as a flag: the negative sign. But there are (at least) 2 possibly useful special ways of indexing: - exclusion (as in R) - using -n for end - n (as in perl) Now we enjoy having a shortcut for exclusion (at least I do), but still feel that marking from the end would be useful. As no other signs (in the sense of flag) are available for integers, we won't be able to stop typing somewhat more in R. Wacek: x[3:] instead of x[3:length(x)] x[3:end] I don't think that would help: what to use for end - 3 within the convention that negative values mean exclusion? might seem tricky, but not impossible: x[-2] # could mean 'all except for 2nd', as it is now x[1:-2] # could mean 'from start to the 2nd backwards from the end' since r disallows mixing positive and negative indexing, the above would not be ambiguous. worse with x[-3:-1] which could mean both 'except for 3rd, 2nd, and 1st' and 'from the 3rd to the 1st from the end', and so would be ambiguous. in this context, indeed, having explicit 'end' could help avoid the ambiguity. vQ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The end of Matlab
On 12/12/2008 8:25 AM, hadley wickham wrote:
>> From which you might conclude that I don't like the design of subset, and you'd be right. However, I don't think this is a counterexample to my general rule. In the subset function, the select argument is treated as an unevaluated expression, and then there are rules about what to do with it. (I.e. try to look up name `a` in the data frame, if that fails, ...)
>>
>> For the requested behaviour to similarly fall within the general rule, we'd have to treat all indices to all kinds of things (vectors, matrices, dataframes, etc.) as unevaluated expressions, with special handling for the particular symbol `end`.
>
> Except you wouldn't have to necessarily change indexing - you could change seq instead. Then 5:end could produce some kind of special data structure (maybe an iterator) that was recognised by the various indexing functions.

Ummm, doesn't that require changes to *both* indexing and seq?

> This would still be a lot of work for not a lot of payoff, but it would be a logically consistent way of adding this behaviour to indexing, and the basic work would make it possible to develop other sorts of indexing, eg df[evens(), ], or df[last(5), last(3)].

I agree: it would be a nice addition, but a fair bit of work. I think it would be quite doable for the indexable things in the base packages, but there are a lot of contributed packages that define [ methods, and those methods would all need to be modified too.

(Just to be clear, when I say doable, I'm thinking that your iterators return functions that compute subsets of index ranges. For example, evens() might be implemented as

evens <- function() {
  result <- function(indices) {
    indices[indices %% 2 == 0]
  }
  class(result) <- "iterator"
  return(result)
}

and then `[` in v[evens()] would recognize that it had been passed an iterator, and would pass 1:length(v) to the iterator to get the subset of even indices. Is that what you had in mind?)

Duncan
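The iterator idea can be tried without touching base `[` by confining it to a small S3 class. The following is a sketch under that assumption; the class name "lazyvec", the helper `resolve_index()`, and the `last()` iterator are all hypothetical names chosen for this example.

```r
# Iterators are functions tagged with class "iterator": given the full range
# of valid indices, they return the subset to keep.
evens <- function() {
  structure(function(indices) indices[indices %% 2 == 0], class = "iterator")
}
last <- function(n) {
  structure(function(indices) tail(indices, n), class = "iterator")
}

# Turn an iterator into concrete indices; leave ordinary indices alone.
resolve_index <- function(i, n) {
  if (inherits(i, "iterator")) i(seq_len(n)) else i
}

# A '[' method for a hypothetical class that understands iterators.
`[.lazyvec` <- function(x, i) {
  unclass(x)[resolve_index(i, length(x))]
}

v <- structure(11:20, class = "lazyvec")
even_elems <- v[evens()]
last_three <- v[last(3)]
plain      <- v[2:3]
```

This also illustrates the cost noted above: every class with its own `[` method would need the equivalent of `resolve_index()` before such iterators worked everywhere.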
Re: [R] The end of Matlab
>> Wacek:
>>> x[3:]
>>> instead of
>>> x[3:length(x)]
>>> x[3:end]
>>
>> I don't think that would help: what to use for end - 3 within the convention that negative values mean exclusion?
>
> might seem tricky, but not impossible:
>
> x[-2]   # could mean 'all except for 2nd', as it is now
> x[1:-2] # could mean 'from start to the 2nd backwards from the end'
>
> since r disallows mixing positive and negative indexing, the above would not be ambiguous. worse with
>
> x[-3:-1]
>
> which could mean both 'except for 3rd, 2nd, and 1st' and 'from the 3rd to the 1st from the end', and so would be ambiguous. in this context, indeed, having explicit 'end' could help avoid the ambiguity.

on the other hand, another possible solution would be to have ':' mean, inside range selection expressions, not the usual sequence generation, but rather specification of start and end indices:

x[1:2]        # from 1st to 2nd, inclusive
x[seq(1,2)]   # same as above
x[c(1,2)]     # same as above

x[1:-2]       # from 1st to 2nd from the end, not x[c(1,0,-1,-2)]
x[seq(1,-2)]  # no way, mixed indices

x[-2:-1]      # from 2nd to 1st, both from the end, not x[c(-2,-1)]
x[length(x) + -1:0]  # same as above
x[seq(-2,-1)] # except for 2nd and 1st
x[c(-2,-1)]   # same as above

x[2:]         # from 2nd up
x[seq(2, max(2, length(x)))]  # same as above (would not be without max)

x[:3]         # up to 3rd
x[seq(1,3)]   # same as above

x[:-3]        # up to 3rd from the end
x[seq(1, length(x)-2)]  # same as above

with additional specifications for the behaviour in case of invalid indices and decreasing indices:

x[2:1]        # from the 2nd to the 1st, in reverse order
              # or nothing, invalid indexing

which can be easily done with the unambiguous x[seq(2,1)] or x[c(2,1)]

this is daydreaming, of course, because such modifications would break much old code, and the benefit may not outweigh the effort.

vQ
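The start/end reading of ':' proposed above cannot be added to R's parser from user code, but its semantics can be approximated with an ordinary function. This is a rough sketch; `span()` and its defaults are invented for the illustration, and it returns an empty selection for decreasing bounds, which is only one of the two behaviours the message leaves open.

```r
# Sketch: select x[from..to], where a negative bound counts from the end
# (-1 is the last element, -2 the one before it, and so on).
span <- function(x, from = 1, to = -1) {
  n <- length(x)
  pos <- function(i) if (i < 0) n + i + 1 else i
  from <- pos(from)
  to <- pos(to)
  if (from > to) return(x[0])   # decreasing bounds: empty, never reversed
  x[from:to]
}

x <- 1:10
a <- span(x, 1, -2)      # like the proposed x[1:-2]
b <- span(x, -2, -1)     # like the proposed x[-2:-1]
c2 <- span(x, 2)         # like the proposed x[2:]
d <- span(x, to = -3)    # like the proposed x[:-3]
e <- span(x, 5, 2)       # decreasing bounds
```

Because exclusion keeps its usual meaning outside `span()`, existing code is unaffected, at the cost of a function call instead of new syntax.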
Re: [R] The end of Matlab
Here is how to emulate matlab end in R in the case of matrices. Rather than redefine the matrix class (which would be a bit intrusive) we just define a subclass of matrix called matrix2.

Note in the examples that matrix2 survives some operations such as + but not others such as crossprod, so in those one would have to coerce back to matrix2 using as.matrix2.

as.matrix2 <- function(x, ...) UseMethod("as.matrix2")
as.matrix2.default <- function(x, ...) {
  do.call(structure, list(x, ..., class = c("matrix2", setdiff(class(x), "matrix2"))))
}
matrix2 <- function(data, ...) as.matrix2(matrix(data, ...))
"[.matrix2" <- function(x, i, j, ...) {
  # replace the symbol end in each index expression by the extent of that dimension
  i <- if (missing(i)) TRUE else eval.parent(do.call(substitute, list(substitute(i), list(end = nrow(x)))))
  j <- if (missing(j)) TRUE else eval.parent(do.call(substitute, list(substitute(j), list(end = ncol(x)))))
  .subset(x, i, j, ...)
}

# test
m <- matrix2(1:12, 3, 4)

# matrix2 survives the + operation
> class(m+2)
[1] "matrix2" "matrix"

# but not crossprod
> class(crossprod(m))
[1] "matrix"

# coercing back
> as.matrix2(crossprod(m))
     [,1] [,2] [,3] [,4]
[1,]   14   32   50   68
[2,]   32   77  122  167
[3,]   50  122  194  266
[4,]   68  167  266  365
attr(,"class")
[1] "matrix2" "matrix"

# example of using end
> m[2:end, 2:end]
     [,1] [,2] [,3]
[1,]    5    8   11
[2,]    6    9   12

On Thu, Dec 11, 2008 at 9:45 PM, Mike Rowe mwr...@gmail.com wrote:
> Greetings!
>
> I come to R by way of Matlab. One feature in Matlab I miss is its end keyword. When you put end inside an indexing expression, it is interpreted as the length of the variable along the dimension being indexed. For example, if the same feature were implemented in R:
>
> my.vector[5:end]
>
> would be equivalent to:
>
> my.vector[5:length(my.vector)]
>
> or:
>
> this.matrix[3:end,end]
>
> would be equivalent to:
>
> this.matrix[3:nrow(this.matrix),ncol(this.matrix)]
> # or
> this.matrix[3:dim(this.matrix)[1],dim(this.matrix)[2]]
>
> As you can see, the R version requires more typing, and I am a lousy typist.
>
> With this in mind, I wanted to try to implement something like this in R. It seems like that in order to be able to do this, I would have to be able to access the parse tree of the expression currently being evaluated by the interpreter from within my End function-- is this possible? Since the [ and [[ operators are primitive I can't see their arguments via the call stack functions...
>
> Anyone got a workaround? Would anybody else like to see this feature added to R?
>
> Thanks,
> Mike
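The same trick carries over to plain vectors. Below is a minimal sketch of a vector analogue; the class name "endvec" is hypothetical, and the sketch evaluates the captured index expression with `end` bound to the length rather than rewriting the expression with `do.call(substitute, ...)` as the matrix version does.

```r
# Sketch: a '[' method that evaluates its index expression in an environment
# where the name 'end' is bound to length(x). Other names in the expression
# are looked up in the caller's frame via the 'enclos' argument of eval().
`[.endvec` <- function(x, i) {
  idx <- eval(substitute(i), list(end = length(x)), parent.frame())
  unclass(x)[idx]
}

v <- structure(c(10, 20, 30, 40, 50), class = "endvec")

k <- 2
from_third <- v[3:end]        # elements 3 .. 5
last_elem  <- v[end]          # element 5
next_last  <- v[end - 1]      # element 4
from_k     <- v[k:end]        # 'k' comes from the caller, 'end' from the method
```

As with matrix2, the class attribute is dropped by the subsetting itself, and any user variable that happens to be called `end` is shadowed inside the brackets, which is the context dependence objected to earlier in the thread.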
Re: [R] The end of Matlab
I just realized that my idea of doing something without going into the extraction functions themselves won't work :-( it was a nice dream, though.

The reason is that there is no general way to find out what the needed length is: I'm just writing a class where 2 kinds of columns are involved. I don't give it a dim attribute, though. But I could, and then: how to know how it should be interpreted?

> on the other hand, another possible solution would be to have ':' mean, inside range selection expressions, not the usual sequence generation, but rather specification of start and end indices:
> ...
> this is daydreaming, of course, because such modifications would break much old code,

nothing would break if some other sign instead of : would be used. Maybe something like end...

> and the benefit may not outweigh the effort.

This might be true in any case. If I only think of how many lines of nrow, ncol, length & Co I could have written instead of posting wrong proposals

Claudia

--
Claudia Beleites
Dipartimento dei Materiali e delle Risorse Naturali
Università degli Studi di Trieste
Via Alfonso Valerio 6/a
I-34127 Trieste
phone: +39 (0 40) 5 58-34 47
email: cbelei...@units.it
Re: [R] The end of Matlab
Claudia Beleites wrote:
>>> Wacek:
>>>> x[3:]
>>>> instead of
>>>> x[3:length(x)]
>>>> x[3:end]
>>>
>>> I don't think that would help: what to use for end - 3 within the convention that negative values mean exclusion?
>>
>> might seem tricky, but not impossible:
>>
>> x[-2]   # could mean 'all except for 2nd', as it is now
>> x[1:-2] # could mean 'from start to the 2nd backwards from the end'
>
> I know you get thus far. You might even think to decide whether exclusion or 'from the end' is meant from ascending ./. descending order of the sequence, but this messes around with returning the reverse order.

that's a design issue. one simple solution is to have this sort of indexing return always in ascending order. thus,

x = 1:5
x[1:-1]     # 1 2 3 4 5
x[5:-5]     # NULL rather than 5 4 3 2 1 -- as in matlab or python
x[seq(5,1)] # 5 4 3 2 1

that is, the ':'-based indexing can be made not to mess with the order. for reversing the order, why not use:

x[5:-1:1]   # 5 4 3 2 1
x[-3:-1:-5] # 3 2 1 rather than x[c(-3,-4,-5)], which would be 1 2

>> since r disallows mixing positive and negative indexing, the above would not be ambiguous. worse with
>>
>> x[-3:-1]
>>
>> which could mean both 'except for 3rd, 2nd, and 1st' and 'from the 3rd to the 1st from the end', and so would be ambiguous. in this context, indeed, having explicit 'end' could help avoid the ambiguity.
>
> that's the problem. also: how would 'except from the 5th last to the 3rd last' be expressed?

for exclusions you'd need to use negative indices anyway:

x[seq(-5,-3)]

now, neither x[-5:-3] nor x[-3:-5] would do the job they do now, but the above is not particularly longer, while selecting the 5th-to-3rd-from-the-end columns is simply

x[-5:-3]

(which could be made to fail on out-of-range indices) instead of something like

x[length(x) - 4:2]

(which will silently do the wrong thing if length(x) < 4, and thus requires extra care).

this is a rather loose idea, and unrealistic in the context of r, but i do not see much problem with it on the conceptual level.

vQ
Re: [R] The end of Matlab
Just to muddy the waters a bit further. Currently we can do things like:

> pascal.tri <- numeric(0)
> class(pascal.tri) <- 'pasctri'
> `[.pasctri` <- function(x, ...) {
+   dots <- list(...)
+   n <- dots[[1]]
+   row <- choose(n, 0:n)
+   if(length(dots) > 1) {
+     row <- row[ dots[[2]] ]
+   }
+   row
+ }
> pascal.tri[4]
[1] 1 4 6 4 1
> pascal.tri[4,2]
[1] 4

Now whether that is clever or abusive, I'm not sure (probably not clever). But what would we expect:

> pascal.tri[end]

to return?

Also if we can access the last element of a vector as:

> x[end]

(which I am not opposed to, just don't know if it is worth the effort) then how long will it be before someone wants to be able to do:

> x[end+1] <- new.value

and put that in a loop, which would lead to very poor programming practice (but so easy it would tempt many).

Just my $0.015 worth,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Wacek Kusnierczyk
Sent: Friday, December 12, 2008 8:57 AM
To: claudia.belei...@gmx.de
Cc: R help
Subject: Re: [R] The end of Matlab

Claudia Beleites wrote:

Wacek: x[3:] instead of x[3:length(x)] x[3:end]

I don't think that would help: what to use for end - 3 within the convention that negative values mean exclusion?

might seem tricky, but not impossible:

  x[-2]    # could mean 'all except for 2nd', as it is now
  x[1:-2]  # could mean 'from start to the 2nd backwards from the end'

I know you get thus far. You might even think to decide whether exclusion or 'from the end' is meant from ascending ./. descending order of the sequence, but this messes around with returning the reverse order.

that's a design issue. one simple solution is to have this sort of indexing return always in ascending order. thus,

  x = 1:5
  x[1:-1]      # 1 2 3 4 5
  x[5:-5]      # NULL rather than 5 4 3 2 1 -- as in matlab or python
  x[seq(5,1)]  # 5 4 3 2 1

that is, the ':'-based indexing can be made not to mess with the order. for reversing the order, why not use:

  x[5:-1:1]    # 5 4 3 2 1
  x[-3:-1:-5]  # 3 2 1 rather than x[c(-3,-4,-5)], which would be 1 2

since r disallows mixing positive and negative indexing, the above would not be ambiguous.

worse with x[-3:-1] which could mean both 'except for 3rd, 2nd, and 1st' and 'from the 3rd to the 1st from the end', and so would be ambiguous. in this context, indeed, having explicit 'end' could help avoid the ambiguity.

that's the problem. also: how would 'except from the 5th last to the 3rd last' be expressed?

for exclusions you'd need to use negative indices anyway:

  x[seq(-5,-3)]

now, neither x[-5:-3] nor x[-3:-5] would do the job they do now, but the above is not particularly longer, while selecting the 5th-to-3rd-from-the-end columns is simply

  x[-5:-3]

(which could be made to fail on out-of-range indices) instead of something like

  x[length(x) - 4:2]

(which will silently do the wrong thing if length(x) < 4, and thus requires extra care).

this is a rather loose idea, and unrealistic in the context of r, but i do not see much problem with it on the conceptual level.

vQ
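The "count from the end" selection discussed above can be approximated today with an ordinary function. The following is only a rough sketch: `from_end` is a made-up name, not something proposed in the thread, and it assumes the result should always come back in ascending position order and that out-of-range requests should fail loudly.

```r
# Hypothetical helper: select the `from`-th through `to`-th elements counted
# from the end of x (1 = last element), returned in ascending position order.
from_end <- function(x, from, to = from) {
  n <- length(x)
  if (to > from) stop("`from` must be at least `to`")
  if (to < 1 || from > n) stop("from-the-end index out of range")
  x[(n - from + 1):(n - to + 1)]
}

x <- 1:10
from_end(x, 5, 3)      # positions 6, 7, 8
x[length(x) - 4:2]     # the same elements when x is long enough

short <- 1:4
short[length(short) - 4:2]   # indices are 0 1 2, so this quietly returns 1 2
# from_end(short, 5, 3) stops with an error instead
```

The arithmetic version gives no warning for the length-4 vector because a zero index is silently dropped, which is the kind of extra care mentioned above.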
Re: [R] The end of Matlab
On Fri, Dec 12, 2008 at 8:41 AM, Duncan Murdoch murd...@stats.uwo.ca wrote:
> On 12/12/2008 8:25 AM, hadley wickham wrote:
>>> From which you might conclude that I don't like the design of subset, and you'd be right. However, I don't think this is a counterexample to my general rule. In the subset function, the select argument is treated as an unevaluated expression, and then there are rules about what to do with it. (I.e. try to look up name `a` in the data frame, if that fails, ...)
>>>
>>> For the requested behaviour to similarly fall within the general rule, we'd have to treat all indices to all kinds of things (vectors, matrices, dataframes, etc.) as unevaluated expressions, with special handling for the particular symbol `end`.
>>
>> Except you wouldn't have to necessarily change indexing - you could change seq instead. Then 5:end could produce some kind of special data structure (maybe an iterator) that was recognised by the various indexing functions.
>
> Ummm, doesn't that require changes to *both* indexing and seq?

Ooops, yes. I meant it wouldn't require indexing to use unevaluated expression.

>> This would still be a lot of work for not a lot of payoff, but it would be a logically consistent way of adding this behaviour to indexing, and the basic work would make it possible to develop other sorts of indexing, eg df[evens(), ], or df[last(5), last(3)].
>
> I agree: it would be a nice addition, but a fair bit of work. I think it would be quite doable for the indexable things in the base packages, but there are a lot of contributed packages that define [ methods, and those methods would all need to be modified too.

That's true, although I suspect many contributed [.methods eventually delegate to base methods and might work without further modification.

> (Just to be clear, when I say doable, I'm thinking that your iterators return functions that compute subsets of index ranges. For example, evens() might be implemented as
>
>   evens <- function() {
>     result <- function(indices) {
>       indices[indices %% 2 == 0]
>     }
>     class(result) <- "iterator"
>     return(result)
>   }
>
> and then `[` in v[evens()] would recognize that it had been passed an iterator, and would pass 1:length(v) to the iterator to get the subset of even indices. Is that what you had in mind?)

Yes, that's exactly what I was thinking, although you'd have to put some thought into the conventions - would it be better to pass in the length of the vector instead of a vector of indices? Should all iterators return logical vectors? That way you could do x[evens() & last(5)] to get the even indices out of the last 5, as opposed to x[evens()][last(5)] which would return the last 5 even indices.

You could also imagine similar iterators for random sampling, like samp(0.2) to choose 20% of the indices, or boot(0.8) to choose 80% with replacement. first(n) could also be useful, selecting the first min(n, length(vector)) observations. An iterator version of rev() would also be handy.

Maybe selector would be a better name than iterator though, as these don't have the same feel as iterators in other languages.

Hadley

--
http://had.co.nz/
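A rough sketch of how the iterator/selector idea above might be tried out without touching base R. Because `[` dispatches only on the object being indexed, the sketch wraps the data in a made-up class, `selectable`; that wrapper, the `selector()` constructor and the bodies of `last()` and `first()` are assumptions for illustration, not anything agreed in the thread.

```r
# A selector is a function from candidate indices to the indices to keep.
selector <- function(f) structure(f, class = "selector")

evens <- function()  selector(function(indices) indices[indices %% 2 == 0])
last  <- function(n) selector(function(indices) utils::tail(indices, n))
first <- function(n) selector(function(indices) utils::head(indices, n))

# Hypothetical wrapper class so that a `[` method can see the selector.
selectable <- function(x) structure(x, class = "selectable")

`[.selectable` <- function(x, i, ...) {
  if (missing(i)) return(x)
  # Hand the full index range to the selector, as described above.
  if (inherits(i, "selector")) i <- i(seq_along(x))
  selectable(unclass(x)[i])
}

v <- selectable(11:20)
unclass(v[evens()])           # elements at even positions
unclass(v[last(3)])           # last three elements
unclass(v[evens()][last(2)])  # last two of the even-position elements
```

Ordinary numeric or logical indices fall through to the default method, so existing code using a `selectable` object would behave as before.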
Re: [R] The end of Matlab
On Fri, 12 Dec 2008 17:38:13 +0100, hadley wickham h.wick...@gmail.com wrote: You could also imagine similar iterators for random sampling, like samp(0.2) to choose 20% of the indices, or boot(0.8) to choose 80% with replacement. first(n) could also be useful, selecting the first min(n, length(vector)) observations. An iterator version of rev() would also be handy. Maybe selector would be a better name than iterator though, as these don't have the same feel as iterators in other languages. That is really something!! Real high level language!! Selectors could depend on named variables in data frame as well: mtcars[sel(cyl3)last(5)] mtcars[sel(cyl3)boot(80%)] or may be just mtcars[cyl3last(20)] or this is already too far? VS. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The end of Matlab
On 12/12/2008 11:38 AM, hadley wickham wrote:
> Yes, that's exactly what I was thinking, although you'd have to put some thought into the conventions - would it be better to pass in the length of the vector instead of a vector of indices? Should all iterators return logical vectors? That way you could do x[evens() & last(5)] to get the even indices out of the last 5, as opposed to x[evens()][last(5)] which would return the last 5 even indices.

Actually, I don't think so. evens() & last(5) would fail to evaluate, because you're trying to do a logical combination of two functions, not of two logical vectors. Or are we going to extend the logical operators to work on iterators/selectors too?

Duncan Murdoch
Re: [R] The end of Matlab
On Fri, Dec 12, 2008 at 11:18 AM, Duncan Murdoch murd...@stats.uwo.ca wrote:
> Actually, I don't think so. evens() & last(5) would fail to evaluate, because you're trying to do a logical combination of two functions, not of two logical vectors. Or are we going to extend the logical operators to work on iterators/selectors too?

Oh yes, that's a good point. But wouldn't the following do the job?

  "&.selector" <- function(a, b) {
    function(n) a(n) & b(n)
  }

or

  "&.selector" <- function(a, b) {
    function(n) intersect(a(n), b(n))
  }

depending on whether selectors return logical or numeric vectors. Writing functions for | and ! would be similarly easy. Or am I missing something?

Hadley

--
http://had.co.nz/
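A small sketch of the operator idea above, under the assumption that every selector takes the length n and returns a logical vector of that length. The `selector()` constructor and the bodies of `evens()` and `last()` are illustrative guesses; the sketch also re-wraps each result so that combined selectors can be combined again, which the definitions in the message do not do.

```r
selector <- function(f) structure(f, class = "selector")

# Logical-vector convention: each selector maps a length n to n TRUE/FALSE values.
evens <- function()  selector(function(n) seq_len(n) %% 2 == 0)
last  <- function(k) selector(function(n) seq_len(n) > n - k)

# S3 methods for the logical operators; each returns a new selector.
`&.selector` <- function(a, b) selector(function(n) a(n) & b(n))
`|.selector` <- function(a, b) selector(function(n) a(n) | b(n))
`!.selector` <- function(a)    selector(function(n) !a(n))

x <- 1:10
both <- evens() & last(5)      # even positions among the last five
x[both(length(x))]
x[(!evens())(length(x))]       # odd positions
```

Without an indexing method that recognises selectors, the combined selector still has to be called with the length by hand, as in the last two lines.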
Re: [R] The end of Matlab
On Fri, Dec 12, 2008 at 11:11 AM, Vitalie Spinu vitosm...@rambler.ru wrote: On Fri, 12 Dec 2008 17:38:13 +0100, hadley wickham h.wick...@gmail.com wrote: You could also imagine similar iterators for random sampling, like samp(0.2) to choose 20% of the indices, or boot(0.8) to choose 80% with replacement. first(n) could also be useful, selecting the first min(n, length(vector)) observations. An iterator version of rev() would also be handy. Maybe selector would be a better name than iterator though, as these don't have the same feel as iterators in other languages. That is really something!! Real high level language!! Selectors could depend on named variables in data frame as well: mtcars[sel(cyl3)last(5)] mtcars[sel(cyl3)boot(80%)] or may be just mtcars[cyl3last(20)] or this is already too far? This would be a considerable extension because then the selector would need to know about all other variables in the dataset, and you'd need someway of combining selectors with logical vectors. So it would be a huge increase in complexity for not much gain, given that with just the interface we have described you could do: mtcars[mtcars$cyl 3, ][last(20), ] # or subset(mtcars, cyl 3)[last(20), ] The main idea of selectors is that they would be independent of the data structure that they are being used with. Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The end of Matlab
On Fri, Dec 12, 2008 at 12:11 PM, Vitalie Spinu vitosm...@rambler.ru wrote:
> That is really something!! Real high level language!!
>
> Selectors could depend on named variables in data frame as well

You can do this (and quite a bit more) in data.table:

> library(data.table)
> mtcars.dt <- as.data.table(mtcars)
> tail(mtcars.dt[cyl < 5], 4)
      mpg cyl  disp  hp drat    wt qsec vs am gear carb
[1,] 27.3   4  79.0  66 4.08 1.935 18.9  1  1    4    1
[2,] 26.0   4 120.3  91 4.43 2.140 16.7  0  1    5    2
[3,] 30.4   4  95.1 113 3.77 1.513 16.9  1  1    5    2
[4,] 21.4   4 121.0 109 4.11 2.780 18.6  1  1    4    2
Re: [R] The end of Matlab
On 12/12/2008 12:23 PM, hadley wickham wrote:
> Oh yes, that's a good point. But wouldn't the following do the job?
>
>   "&.selector" <- function(a, b) {
>     function(n) a(n) & b(n)
>   }
>
> or
>
>   "&.selector" <- function(a, b) {
>     function(n) intersect(a(n), b(n))
>   }
>
> depending on whether selectors return logical or numeric vectors. Writing functions for | and ! would be similarly easy. Or am I missing something?

No, I think those definitions would be fine, but I'd be concerned about speed issues if we start messing with primitives.

While we're at it, we might as well do the same sort of thing for :, and define a selector named end, and then 3:end would give a selector from 3 to the end, which brings us back to the original question. So it's not nearly as intrusive as I thought it would be.

Duncan Murdoch
Re: [R] The end of Matlab
> No, I think those definitions would be fine, but I'd be concerned about speed issues if we start messing with primitives.

Speed or expressiveness: pick one? ;) People could always use the regular subsetting mechanisms if they want the best speed - any changes to support selectors wouldn't affect the speed of the other methods of subsetting, would they?

> While we're at it, we might as well do the same sort of thing for :, and define a selector named end, and then 3:end would give a selector from 3 to the end, which brings us back to the original question. So it's not nearly as intrusive as I thought it would be.

3:end() do you mean? Or do you mean extending seq so that it uses unevaluated input?

Hadley

--
http://had.co.nz/
Re: [R] The end of Matlab
On 12/12/2008 1:06 PM, hadley wickham wrote:
> 3:end() do you mean? Or do you mean extending seq so that it uses unevaluated input?

My end would be the output of your end(). If there are no args and no local context, I don't see the need for it to be a function call. It would just be defined as something like

  end <- structure(function(n) c(rep(FALSE, n-1), TRUE), class="selector")

I'm not sure what the definition of : should be if one of the args is a selector.

Duncan
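A sketch of the `end` selector defined above, together with one possible stand-in for `3:end`. As far as I know, base R's `:` does not dispatch on the class of its arguments, so the sketch uses an ordinary function instead; `from()` and `pick()` are made-up names for illustration only.

```r
# The definition from the message: TRUE only in the last position.
# Defining it at top level masks stats::end() for the rest of the session.
end <- structure(function(n) c(rep(FALSE, n - 1), TRUE), class = "selector")

# Hypothetical stand-in for `3:end`: TRUE from position k through the end.
from <- function(k) structure(function(n) seq_len(n) >= k, class = "selector")

# Hypothetical helper that applies a selector to a vector.
pick <- function(x, sel) x[sel(length(x))]

v <- c(10, 20, 30, 40, 50)
pick(v, end)       # last element
pick(v, from(3))   # third element through the end
```

Note that `end` as written fails on a zero-length vector, since `rep(FALSE, -1)` is an error; a real implementation would need to decide what the last element of an empty vector means.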
Re: [R] The end of Matlab
> evens() & last(5)

wouldn't x[evens()][last(5)] do the already? or is different, though.

Claudia

--
Claudia Beleites
Dipartimento dei Materiali e delle Risorse Naturali
Università degli Studi di Trieste
Via Alfonso Valerio 6/a
I-34127 Trieste
phone: +39 (0 40) 5 58-34 47
email: cbelei...@units.it
Re: [R] The end of Matlab
That depends on what you want evens() & last(5) to mean. Does that mean the last 5 evens (returning 5 values) or the values in the last 5 that are also even items (returning either 2 or 3 values depending on if the structure has an odd or even number of elements). It could be interpreted either way. Your subset below does the first, the other examples do the 2nd.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Claudia Beleites
Sent: Friday, December 12, 2008 11:38 AM
To: r-help@r-project.org
Subject: Re: [R] The end of Matlab

evens() & last(5)

wouldn't x[evens()][last(5)] do the already? or is different, though.

Claudia
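The two readings described above can be compared with plain index vectors and no selector machinery at all. This is only an illustration; the variable and function names are made up.

```r
# Reading 1: the last five elements at even positions (subset, then subset).
# Reading 2: the even positions that fall within the last five positions.
two_readings <- function(x) {
  idx       <- seq_along(x)
  evens_idx <- idx[idx %% 2 == 0]
  last5_idx <- tail(idx, 5)
  list(
    last_five_evens = tail(x[evens_idx], 5),
    evens_in_last5  = x[intersect(evens_idx, last5_idx)]
  )
}

two_readings(1:20)   # reading 2 keeps three values for an even length
two_readings(1:21)   # reading 2 keeps two values for an odd length
```

For a vector with at least ten elements the first reading always returns five values, while the second returns two or three depending on whether the length is odd or even.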
Re: [R] The end of Matlab
> My end would be the output of your end(). If there are no args and no local context, I don't see the need for it to be a function call. It would just be defined as something like
>
>   end <- structure(function(n) c(rep(FALSE, n-1), TRUE), class="selector")

Oh, I see what you mean.

> I'm not sure what the definition of : should be if one of the args is a selector.

Alternatively you could use !first(2), and only use end/last when you want to select the last n observations. Of course !first(2) would be the equivalent to -(1:2) so there's not much savings there.

Hadley

--
http://had.co.nz/
Re: [R] The end of Matlab
On Fri, 12 Dec 2008 18:27:02 +0100, hadley wickham h.wick...@gmail.com wrote: or may be just mtcars[cyl3last(20)] or this is already too far? This would be a considerable extension because then the selector would need to know about all other variables in the dataset, and you'd need someway of combining selectors with logical vectors. If selector returns a logical vector then I really don't see where is the problem. Probably I am mistaken but implementing mtcars[cyl3] is not such a big deal. Just an operator `[.` start searching for cyl from inside the x frame and not from parent.frame as it does now. It is just like putting with inside '[', or not? When started with R I was really disappointed that such a natural and intuitive subsetting is not allowed, but instead lengthy and ackward mtcars[mtcars$syl3] is required. R is an interactive language for 99% of the users and features like that(and selectors indeed) would make a tremendous difference. Regards, SV. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The end of Matlab
On Fri, Dec 12, 2008 at 3:08 PM, Vitalie Spinu vitosm...@rambler.ru wrote:
> Just an operator `[.` start searching for cyl from inside the x frame and not from parent.frame as it does now. It is just like putting with inside '[', or not?

And that's a big change to the current behaviour! I think there are a few good reasons why this shouldn't be the default:

 * You could no longer do: cyl <- 4; mtcars[mtcars$cyl == cyl, ] (which is very useful when writing functions)

 * If you want that behaviour, then just use subset

 * It only makes sense for variables of data frames, not for all the other types of subsets

 * Generally it's better to be explicit than not

--
http://had.co.nz/
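The first reason in the list above can be seen directly with the built-in mtcars data. This is a small illustration, not code from the thread; the variable names are made up.

```r
cyl <- 4   # a variable in the calling scope that shares its name with a column

# Explicit form: the left side is the column, the right side is the local variable.
four_cyl <- mtcars[mtcars$cyl == cyl, ]
nrow(four_cyl)   # only the four-cylinder cars

# subset() looks names up inside the data frame first, so both sides of the
# comparison resolve to the column and every row passes.
all_rows <- subset(mtcars, cyl == cyl)
nrow(all_rows)   # the whole data frame
```

If `[` itself looked names up inside the data frame, the explicit form would lose its way of referring to the outer `cyl`.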
[R] The end of Matlab
Greetings!

I come to R by way of Matlab. One feature in Matlab I miss is its end keyword. When you put end inside an indexing expression, it is interpreted as the length of the variable along the dimension being indexed. For example, if the same feature were implemented in R:

  my.vector[5:end]

would be equivalent to:

  my.vector[5:length(my.vector)]

or:

  this.matrix[3:end,end]

would be equivalent to:

  this.matrix[3:nrow(this.matrix),ncol(this.matrix)] # or
  this.matrix[3:dim(this.matrix)[1],dim(this.matrix)[2]]

As you can see, the R version requires more typing, and I am a lousy typist.

With this in mind, I wanted to try to implement something like this in R. It seems that in order to be able to do this, I would have to be able to access the parse tree of the expression currently being evaluated by the interpreter from within my End function-- is this possible? Since the [ and [[ operators are primitive I can't see their arguments via the call stack functions...

Anyone got a workaround? Would anybody else like to see this feature added to R?

Thanks,
Mike
Re: [R] The end of Matlab
Use tail and head. See interspersed.

On Thu, Dec 11, 2008 at 9:45 PM, Mike Rowe mwr...@gmail.com wrote:
> Greetings!
>
> I come to R by way of Matlab. One feature in Matlab I miss is its end keyword. When you put end inside an indexing expression, it is interpreted as the length of the variable along the dimension being indexed. For example, if the same feature were implemented in R:
>
> my.vector[5:end]

tail(my.vector, -4)

> would be equivalent to:
>
> my.vector[5:length(my.vector)]
>
> or:
>
> this.matrix[3:end,end]

tail(tail(this.matrix, -2), 1)

> would be equivalent to:
>
> this.matrix[3:nrow(this.matrix),ncol(this.matrix)] # or
> this.matrix[3:dim(this.matrix)[1],dim(this.matrix)[2]]
Re: [R] The end of Matlab
It's been pointed out to me that the second one is wrong. It should be:

    tail(this.matrix, -2)[, ncol(this.matrix)]

which is not as compact as Matlab or my prior post but still not particularly onerous.

On Thu, Dec 11, 2008 at 11:49 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> Use tail and head. See interspersed.
>
> On Thu, Dec 11, 2008 at 9:45 PM, Mike Rowe mwr...@gmail.com wrote:
>>     this.matrix[3:end,end]
>
>     tail(tail(this.matrix, -2), 1)
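To make the correction concrete, here is a small check of the tail() forms against the explicit indexing from the original post. The sample data (a vector 1:10 and a 5 x 4 matrix) are made up for illustration.

```r
# Illustrative data, not from the thread.
my.vector <- 1:10
this.matrix <- matrix(1:20, nrow = 5, ncol = 4)

# Vector case: drop the first 4 elements, i.e. keep element 5 through the last.
tail(my.vector, -4)             # 5 6 7 8 9 10
my.vector[5:length(my.vector)]  # 5 6 7 8 9 10

# Matrix case, corrected form: rows 3 to the last, last column only.
# The values are 18 19 20; tail() may attach row labels as names.
by_tail  <- tail(this.matrix, -2)[, ncol(this.matrix)]
by_index <- this.matrix[3:nrow(this.matrix), ncol(this.matrix)]
all(by_tail == by_index)        # TRUE

# The form from the earlier post keeps the last *row* of the remaining rows
# (values 5 10 15 20), which is not the same selection.
tail(tail(this.matrix, -2), 1)
```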
Re: [R] The end of Matlab
On 11/12/2008 9:45 PM, Mike Rowe wrote:
> Greetings! I come to R by way of Matlab. One feature in Matlab I miss is its end keyword. When you put end inside an indexing expression, it is interpreted as the length of the variable along the dimension being indexed. For example, if the same feature were implemented in R:
>
>     my.vector[5:end]
>
> would be equivalent to:
>
>     my.vector[5:length(my.vector)]

And if my.vector is of length less than 5?

> or:
>
>     this.matrix[3:end,end]
>
> would be equivalent to:
>
>     this.matrix[3:nrow(this.matrix),ncol(this.matrix)]
>     # or
>     this.matrix[3:dim(this.matrix)[1],dim(this.matrix)[2]]
>
> As you can see, the R version requires more typing, and I am a lousy typist.

It doesn't save typing, but a more readable version would be

    rows <- nrow(this.matrix)
    cols <- ncol(this.matrix)
    this.matrix[3:rows, cols]

> With this in mind, I wanted to try to implement something like this in R. It seems like that in order to be able to do this, I would have to be able to access the parse tree of the expression currently being evaluated by the interpreter from within my End function-- is this possible? Since the [ and [[ operators are primitive I can't see their arguments via the call stack functions...
>
> Anyone got a workaround? Would anybody else like to see this feature added to R?

I like the general rule that subexpressions have values that can be evaluated independent of context, so I don't think this is a good idea.

Duncan Murdoch
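The question above about a vector shorter than 5 points at a real pitfall: in R, 5:n counts downward when n is less than 5, so the explicit form silently returns the wrong elements. A short illustration follows, with a guard written as a hypothetical helper (from_to_end is an assumed name, not an existing function).

```r
short <- c(1, 2, 3)

5:length(short)          # 5 4 3, a descending sequence
short[5:length(short)]   # NA NA 3, rather than an empty result

# tail() with a negative count handles the short case and returns nothing.
tail(short, -4)          # numeric(0)

# Hypothetical helper: indices from `from` to the end of x, or no indices
# at all when x has fewer than `from` elements.
from_to_end <- function(x, from) {
  n <- length(x)
  if (from <= n) from:n else integer(0)
}

short[from_to_end(short, 5)]  # numeric(0)
```

Whether an empty result or an error is the right behaviour for a too-short vector depends on the application; the guard above simply picks the empty result.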