Re: [R] sapply returning list instead of matrix
> I can read the documentation, I see why it happens, but who in their
> right mind would design a function this way?

I think you're possibly starting from the wrong perspective, or at least it might be useful to look at it from a different one.

In many cases, such as simulations, lapply returns a list of identical-length vectors that, for subsequent purposes, would be more convenient if simplified to a vector or matrix, and that's an extra step or two. sapply is the answer to "wouldn't it be nice if lapply simplified things for me if it were possible?"

Now, if your function does something unexpected and returns uneven lengths, that's actually easier to catch if the return type changes. Consider: a function expected to return a length 5 vector could return a length one NA for some input, probably with a warning; that would cause the current sapply to return a list, and subsequent statements expecting a matrix or vector would grind to a halt. This makes it quite hard for bugs to go undetected. Forcing sapply to pad to the same length to guarantee an array would hide that; your script would continue to run and you'd be none the wiser until much later. Bugs could _more_ easily get into production code.

And of course, it is pretty much trivial to test for the correct type on return, using is.list etc, so it's a readily trappable behaviour as long as you plan for it.

S

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
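The point about trapping the return type can be sketched in a few lines of base R. This is only an illustration: `sim_step` and `check_matrix` are hypothetical names invented here, not functions from the thread or from any package.

```r
# Hypothetical simulation step: normally returns 5 numbers, but returns a
# single NA when the input is unusable.
sim_step <- function(n) {
  if (n < 5) return(NA_real_)
  seq_len(5) / n
}

good <- sapply(c(5, 10, 20), sim_step)  # every result has length 5
bad  <- sapply(c(5, 2, 20), sim_step)   # the second result has length 1

is.matrix(good)  # TRUE: 5 rows, one column per input
is.list(bad)     # TRUE: lengths differ, so nothing was simplified

# Hypothetical guard that stops right where the unexpected type appears.
check_matrix <- function(x) {
  if (!is.matrix(x)) stop("expected a matrix, got ", class(x)[1])
  x
}

ok <- check_matrix(good)
caught <- tryCatch(check_matrix(bad), error = function(e) conditionMessage(e))
```

With a guard like this the failure surfaces at the sapply call rather than several statements later.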
Re: [R] sapply returning list instead of matrix
Can I follow up with what I've learned about my own myopia regarding sapply()? First, I appreciate all the feedback.

After thinking about it for a while I realized R designers have often chosen to accommodate interactive usage, and in that context, sapply() returning different types makes perfect sense. If applying both 'mean' and 'var' to multiple data sets in a list, it makes sense to return a matrix, but if applying just 'mean' to the same list of data sets it makes sense to return a vector, not a 1xN matrix.

This works well in an interactive context, but when writing robust applications it is essential that routines return consistent types, especially if the parameters are determined from unpredictable user input. The behavior of functions like sapply() in R seems extraordinary compared to languages I am more familiar with like C, Java, or Python.

In my case I was using sapply() to extract alignments from multiple BAM files that overlap exons of a gene. My application of sapply() returned a matrix with data sets across columns and exons down the rows. This worked well for most genes, but failed when run on a gene with only a single exon because sapply() returned a list instead of a matrix. This bug in my code was just waiting for the right set of inputs to trigger it.

[ Some suggested using vapply() but I don't think that would help in this case because the length of the return value from the applied function is variable and depends on how many exons are in the gene. Or perhaps I just don't understand vapply well. ]

sapply() is behaving very similarly to the way the '[' operator treats data frames. The extract operator '[' returns a vector when extracting a single column from a data frame, otherwise it returns a data frame. However, '[' takes a 'drop' parameter to control this behavior so you can get a consistent type back if you need it. I wish sapply() had a similar option.
-csw
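One way to get a matrix for single-exon and multi-exon genes alike might look like the sketch below. It assumes the number of exons is known before the apply call; `count_per_exon`, `as_columns` and `checked` are hypothetical stand-ins, and the numbers are placeholders rather than real alignment counts.

```r
# Hypothetical stand-in for the per-file extraction: one value per exon.
count_per_exon <- function(file_id, n_exons) {
  rep(as.numeric(file_id), n_exons)  # placeholder numbers only
}

files <- 1:3

# do.call(cbind, lapply(...)) binds one column per file, so the result is a
# matrix whether the gene has several exons or just one.
as_columns <- function(n_exons) {
  do.call(cbind, lapply(files, count_per_exon, n_exons = n_exons))
}
dim(as_columns(4))  # 4 rows (exons), 3 columns (files)
dim(as_columns(1))  # still a matrix: 1 row, 3 columns

# vapply() can check the length when it is computed first. With a length-1
# FUN.VALUE it returns a plain vector, so the shape is restored explicitly.
checked <- function(n_exons) {
  res <- vapply(files, count_per_exon, FUN.VALUE = numeric(n_exons),
                n_exons = n_exons)
  matrix(res, nrow = n_exons, ncol = length(files))
}
dim(checked(1))

# The data frame analogue: drop = FALSE keeps the data frame class.
df <- data.frame(x = 1:2, y = 3:4)
class(df[, "x", drop = FALSE])
```

Note that `do.call(cbind, ...)` on an empty list returns NULL, so a zero-file input would still need its own check.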
[R] sapply returning list instead of matrix
Can anyone suggest a rationale for why sapply() returns different types (list and matrix) in the two examples below? Is there any way to get sapply() or any other apply() function to return a matrix in both cases? simplify=TRUE doesn't change the outcome. I understand why it is happening, I just can't understand why such unpredictable behavior makes sense.

[[alternative HTML version deleted]]
Re: [R] sapply returning list instead of matrix
As you ignored the posting guide and posted in HTML, your "below" didn't get through. So one can only guess that it has something to do with (see ?sapply):

"Simplification in sapply is only attempted if X has length greater than zero and if the return values from all elements of X are all of the same (positive) length. If the common length is one the result is a vector, and if greater than one is a matrix with a column corresponding to each element of X."

Return values must also be of the same type, obviously.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
H. Gilbert Welch

On Fri, Jan 31, 2014 at 1:36 PM, chris warth cswa...@gmail.com wrote:
> Can anyone suggest a rationale for why sapply() returns different types
> (list and matrix) in the two examples below?
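The rule quoted from ?sapply can be tried out directly. The following is a small base R sketch; the anonymous functions are made up for illustration only.

```r
x <- 1:3

one   <- sapply(x, function(i) i^2)           # common length 1
two   <- sapply(x, function(i) c(i, i^2))     # common length 2
mixed <- sapply(x, function(i) seq_len(i))    # lengths 1, 2, 3
none  <- sapply(integer(0), function(i) i^2)  # X has length zero

is.numeric(one) && is.null(dim(one))  # TRUE: a plain numeric vector
dim(two)                              # 2 3: one column per element of x
is.list(mixed)                        # TRUE: lengths differ, no simplification
is.list(none)                         # TRUE: zero-length input gives list()
```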
Re: [R] sapply returning list instead of matrix
Hey thanks for the helpful snark, Bert.

To everyone else, I apologize for neglecting to actually include the examples.

    # a() returns a list of length 1, b() a list of length 2
    a <- function(i) { list(1) }
    b <- function(i) { list(1,2) }
    ll <- sapply(seq(3), a, simplify=TRUE)
    mm <- sapply(seq(3), b)

    > class(ll)
    [1] "list"
    > class(mm)
    [1] "matrix"

I can read the documentation, I see why it happens, but who in their right mind would design a function this way? Can you imagine how many bugs are lurking because people haven't yet hit the right set of input that is going to cause sapply() to return a list instead of a matrix?

The point is that having the type of return value depend on the length of output from the applied function is simply madness. It is a terrible design decision. What is to be gained from the fact that I have to test the type of value returned from sapply()?

I was hoping plyr::laply() would be better but it perpetuates the same bad interface.

[so sorry for sending HTML, if that is what's happening. I guess gmail sends HTML by default? ]

On Fri, Jan 31, 2014 at 1:44 PM, Bert Gunter gunter.ber...@gene.com wrote:
> As you ignored the posting guide and posted in HTML, your "below" didn't
> get through. So one can only guess that it has something to do with
> (see ?sapply)
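For readers who want to run the example, here is a sketch in base R. It checks the results with is.matrix() and is.list() rather than class(), since the printed class of a matrix differs between R versions. The functions `a2` and `b2` are additions made here for contrast and are not part of the original post.

```r
a <- function(i) list(1)     # each call returns a list of length 1
b <- function(i) list(1, 2)  # each call returns a list of length 2

ll <- sapply(seq(3), a)
mm <- sapply(seq(3), b)

is.matrix(ll)  # FALSE: a plain list of length 3
is.matrix(mm)  # TRUE: 2 x 3, but its cells are still list elements
is.list(mm)    # TRUE

# Returning atomic vectors instead of lists gives a numeric vector and a
# numeric matrix.
a2 <- function(i) 1
b2 <- function(i) c(1, 2)
v <- sapply(seq(3), a2)
m <- sapply(seq(3), b2)
is.numeric(v)
dim(m)
```

Either way the shape of the result follows the common length of the per-element results, which is the behaviour under discussion.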
Re: [R] sapply returning list instead of matrix
> I can read the documentation, I see why it happens, but who in their
> right mind would design a function this way? Can you imagine how many
> bugs are lurking because people haven't yet hit the right set of input
> that is going to cause sapply() to return a list instead of a matrix().

If you always want a list output, use lapply().

If you want the simplification that sapply() does, but with sanity checks, use vapply(). vapply() lets you assert the type and size of FUN's return value. If all goes well it returns what sapply() would return, but it throws an error if any call to FUN returns something unexpected. (Also, if length(X) is 0, vapply() makes the output a zero-length object of the appropriate type.)

    > vapply(1:3, FUN=seq_along, FUN.VALUE=1L)
    [1] 1 1 1
    > vapply(1:3, FUN=range, FUN.VALUE=c(0,0))
         [,1] [,2] [,3]
    [1,]    1    2    3
    [2,]    1    2    3
    > vapply(1:3, FUN=seq, FUN.VALUE=1L)
    Error in vapply(1:3, FUN = seq, FUN.VALUE = 1L) :
      values must be length 1,
     but FUN(X[[2]]) result is length 2
    > vapply(numeric(0), FUN=range, FUN.VALUE=c(0,0)) # returns 2 by 0 numeric matrix

    [1,]
    [2,]

Bill Dunlap
TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of chris warth
Sent: Friday, January 31, 2014 2:22 PM
To: r-help@r-project.org
Subject: Re: [R] sapply returning list instead of matrix

Hey thanks for the helpful snark, Bert. To everyone else, I apologize for neglecting to actually include the examples.
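The two differences described in this reply (zero-length input and unexpected result lengths) can be compared side by side. This is a sketch using base R only; the error text stored in `msg` is written here for the illustration and is not R's own wording.

```r
# Zero-length input: sapply() gives an empty list, vapply() keeps the
# declared type and shape.
s0 <- sapply(numeric(0), range)
v0 <- vapply(numeric(0), range, FUN.VALUE = numeric(2))
is.list(s0)  # TRUE
dim(v0)      # 2 0

# A result of the wrong length is reported as an error instead of silently
# changing the return type.
msg <- tryCatch(
  vapply(1:3, seq_len, FUN.VALUE = integer(1)),
  error = function(e) "vapply rejected a result of unexpected length"
)
msg
```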
Re: [R] sapply returning list instead of matrix
Pot, meet kettle.

You claim to be able to read documentation, yet you don't reference knowledge gained or clarity lost from such activity in your question.

I think this is a case of inertia of history that we all have to live with at this point. If you thoroughly read the documentation for ?sapply you will encounter the vapply function, which will provide the reliability you want at the cost of some additional syntactic complexity. Or not.

I rarely use apply functions for arrays... if I can't vectorize my calculation, I preallocate my result array and use a for loop to fill it up. I don't have this problem with ddply.

BTW: Gmail is capable of sending plain text... but you might have to read some documentation to find out how.

---
Jeff Newmiller
DCN:jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On January 31, 2014 2:22:00 PM PST, chris warth cswa...@gmail.com wrote:
> Hey thanks for the helpful snark, Bert. To everyone else, I apologize
> for neglecting to actually include the examples.
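The preallocate-and-fill approach mentioned in this reply might look roughly like the sketch below. `calc` and `fill_matrix` are hypothetical names, and the explicit length check is an addition made here because assignment into a matrix column can recycle a shorter vector without complaint.

```r
# Hypothetical per-item calculation returning n_rows values.
calc <- function(i, n_rows) i * seq_len(n_rows)

fill_matrix <- function(items, n_rows) {
  # Preallocate: the shape is fixed before any result is computed.
  out <- matrix(NA_real_, nrow = n_rows, ncol = length(items))
  for (j in seq_along(items)) {
    val <- calc(items[j], n_rows)
    # Guard against a result of the wrong length being recycled silently.
    stopifnot(length(val) == n_rows)
    out[, j] <- val
  }
  out
}

dim(fill_matrix(1:3, 4))  # 4 3
dim(fill_matrix(1:3, 1))  # 1 3: still a matrix for a single row
```

The result type here does not depend on what the calculation returns, at the cost of writing the loop by hand.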