Re: [R] sapply returning list instead of matrix

2014-02-03 Thread S Ellison
> I can read the documentation, I see why it happens, but who in their right
> mind would design a function this way?
I think you're possibly starting from the wrong perspective, or at least it 
might be useful to look at it from a different perspective.

In many cases, such as simulations, lapply returns a list of identical-length 
vectors that, for subsequent purposes, would be more convenient if simplified 
to a vector or matrix, and that's an extra step or two. sapply is the answer to 
"wouldn't it be nice if lapply simplified things for me when that is possible?"

Now, if your function does something unexpected and returns uneven lengths, 
that's actually easier to catch if the return type changes. Consider: a 
function expected to return a length 5 vector could return a length one NA for 
some input, probably with a warning; that would cause the current sapply to 
return a list, and subsequent statements expecting a matrix or vector would 
grind to a halt. This makes it quite hard for bugs to go undetected.
Forcing sapply to pad to the same length to guarantee an array would hide that; 
your script would continue to run and you'd be none the wiser until much later. 
Bugs could _more_ easily get into production code.

And of course, it is pretty much trivial to test for the correct type on 
return, using is.list etc, so it's a readily trappable behaviour as long as you 
plan for it.
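
A minimal sketch of that kind of check, for illustration only. The worker
`simulate_one` and the wrapper `checked_sapply` are hypothetical names made up
here, not anything from the thread or from base R:

```r
# Hypothetical worker: normally returns 5 numbers, but a single NA for bad input.
simulate_one <- function(n) {
  if (n <= 0) return(NA_real_)
  seq_len(5) * n
}

# Run sapply() and refuse to continue unless the result is a non-list matrix.
checked_sapply <- function(X, FUN, ...) {
  res <- sapply(X, FUN, ...)
  if (!is.matrix(res) || is.list(res)) {
    stop("sapply() did not simplify to a matrix; check the lengths FUN returns")
  }
  res
}

ok  <- checked_sapply(1:3, simulate_one)          # 5 x 3 numeric matrix
bad <- tryCatch(checked_sapply(c(1, 0, 2), simulate_one),
                error = function(e) conditionMessage(e))
```

Here the second call fails immediately at the point where the uneven lengths
appear, rather than several statements later.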

S







__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sapply returning list instead of matrix

2014-02-02 Thread chris warth
Can I follow-up with what I've learned about my own myopia regarding
sapply()?

First, I appreciate all the feedback.   After thinking about it for a
while I realized R designers have often chosen to accommodate
interactive usage,  and in that context, sapply() returning different
types makes perfect sense.

If applying both 'mean' and 'var' to multiple data sets in a list, it
makes sense to return a matrix, but if applying just 'mean' to the same
list of data sets it makes sense to return a vector, not a 1xN matrix.
   This works well in an interactive context but when writing robust
applications, it is essential that routines return consistent types,
especially if the parameters are determined from unpredictable user
input.   The behavior of functions like sapply() in R seems
extraordinary compared to languages I am more familiar with like C,
Java, or Python.

In my case I was using sapply() to extract alignments from multiple
BAM files that overlap exons of a gene.   My application of sapply()
returned a matrix with data sets across columns and exons down the
rows.   This worked well for most genes, but failed when run on a gene
with only a single exon because sapply() returned a list instead of a
matrix.   This bug in my code was just waiting for the right set of
inputs to trigger it.

[ Some suggested using vapply() but I don't think that would help in
this case because the length of the return value from the applied
function is variable and depends on how many exons are in the gene.
Or perhaps I just don't understand vapply well. ]
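
On the vapply() question: FUN.VALUE is an ordinary argument evaluated at run
time, so its length can be computed from the data. One caveat is that vapply()
returns a plain vector, not a 1 x N matrix, when that length is 1, so the shape
still has to be forced. A rough sketch under those assumptions, where
`count_overlaps` and `exon_matrix` are hypothetical stand-ins for the real
BAM-based code:

```r
# Hypothetical stand-in for the per-file work: one number per exon.
count_overlaps <- function(sample_id, exons) {
  rep(as.numeric(sample_id), length(exons))
}

exon_matrix <- function(samples, exons) {
  n <- length(exons)
  # The template length is computed at run time from the number of exons.
  res <- vapply(samples, count_overlaps, FUN.VALUE = numeric(n), exons = exons)
  # vapply() gives a plain vector when n == 1, so set the shape explicitly.
  matrix(res, nrow = n, ncol = length(samples))
}

m3 <- exon_matrix(1:4, exons = c("e1", "e2", "e3"))  # exons down rows, samples across
m1 <- exon_matrix(1:4, exons = "e1")                 # single-exon gene
```

With this arrangement both the multi-exon and single-exon cases come back as
matrices, and vapply() still raises an error if any call returns the wrong
length or type.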

sapply() is behaving very similarly to the way the '[' and '[['
operators treat data frames.   The extract operator '[' returns a
vector when extracting a single column from a data frame,  otherwise
it returns a data frame.   However, '[' takes a 'drop' argument to
control this behavior so you can get a consistent type back if you
need it.

I wish sapply() had a similar option.
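
To make the '[' comparison concrete, a small illustration with a made-up data
frame (the object names are arbitrary):

```r
df <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))

one_col_default <- df[, "x"]                # simplified to an integer vector
one_col_kept    <- df[, "x", drop = FALSE]  # still a data frame with one column
two_cols        <- df[, c("x", "y")]        # a data frame either way
```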

-csw



[R] sapply returning list instead of matrix

2014-01-31 Thread chris warth
Can anyone suggest a rationale for why sapply() returns different types
(list and matrix) in the two examples below?   Is there any way to get
sapply() or any other apply() function to return a matrix in both cases?
simplify=TRUE doesn't change the outcome.

I understand why it is happening, I just can't understand why such
unpredictable behavior makes sense.

[[alternative HTML version deleted]]



Re: [R] sapply returning list instead of matrix

2014-01-31 Thread Bert Gunter
As you ignored the posting guide and posted in HTML, your "below"
didn't get through. So one can only guess that it has something to do
with (see ?sapply)

Simplification in sapply is only attempted if X has length greater
than zero and if the return values from all elements of X are all of
the same (positive) length. If the common length is one the result is
a vector, and if greater than one is a matrix with a column
corresponding to each element of X. 

Return values must also be of the same type, obviously.
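
The simplification rule quoted from ?sapply can be seen in a short sketch (the
anonymous functions are arbitrary examples chosen for illustration):

```r
x <- 1:3

len1   <- sapply(x, function(i) i^2)           # common length 1 -> plain vector
len2   <- sapply(x, function(i) c(i, i^2))     # common length 2 -> 2 x 3 matrix
uneven <- sapply(x, function(i) seq_len(i))    # lengths differ  -> list
empty  <- sapply(integer(0), function(i) i^2)  # zero-length X   -> empty list
```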

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch






Re: [R] sapply returning list instead of matrix

2014-01-31 Thread chris warth
Hey thanks for the helpful snark, Bert.
To everyone else, I apologize for neglecting to actually include the
examples.

a <- function(i) { list(1) }
b <- function(i) { list(1,2) }
ll <- sapply(seq(3), a, simplify=TRUE)
mm <- sapply(seq(3), b)
class(ll)
class(mm)
> class(ll)
[1] "list"
> class(mm)
[1] "matrix"

I can read the documentation, I see why it happens, but who in their right
mind would design a function this way?  Can you imagine how many bugs are
lurking because people haven't yet hit the right set of input that is going
to cause sapply() to return a list instead of a matrix?

The point is that having the type of return value depend on the length of
output from the applied function is simply madness.   It is a terrible
design decision.  What is to be gained from the fact that I have to test
the type of value returned from sapply()?   I was hoping plyr::laply()
would be better but it perpetuates the same bad interface.

[so sorry for sending html, if that is what's happening.   I guess gmail
sends html by default? ]




Re: [R] sapply returning list instead of matrix

2014-01-31 Thread William Dunlap
> I can read the documentation, I see why it happens, but who in their right
> mind would design a function this way?  Can you imagine how many bugs are
> lurking because people haven't yet hit the right set of input that is going
> to cause sapply() to return a list instead of a matrix().

If you always want a list output use lapply().  If you want the simplification
that sapply does, but with sanity checks, use vapply().

vapply() lets you assert the type and size of FUN's return value.  If all goes
well it returns what sapply() would return but it throws an error if any call
to FUN returns something unexpected.  (Also, if length(X) is 0, vapply
makes the output be a zero-length object of the appropriate type.)
   > vapply(1:3, FUN=seq_along, FUN.VALUE=1L)
   [1] 1 1 1
   > vapply(1:3, FUN=range, FUN.VALUE=c(0,0))
        [,1] [,2] [,3]
   [1,]    1    2    3
   [2,]    1    2    3
   > vapply(1:3, FUN=seq, FUN.VALUE=1L)
   Error in vapply(1:3, FUN = seq, FUN.VALUE = 1L) :
     values must be length 1,
    but FUN(X[[2]]) result is length 2
   > vapply(numeric(0), FUN=range, FUN.VALUE=c(0,0)) # returns 2 by 0 numeric matrix

   [1,]
   [2,]
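
For the original question of getting a matrix in both cases, one possible
approach is to start from lapply() and bind the pieces explicitly. This is
only a sketch, not something proposed in the thread, and the helper name
`apply_to_matrix` is made up here:

```r
apply_to_matrix <- function(X, FUN, ...) {
  pieces <- lapply(X, FUN, ...)
  # cbind() makes one column per element of X, even when each piece has length 1.
  do.call(cbind, pieces)
}

one_row  <- apply_to_matrix(1:3, function(i) i)          # 1 x 3 matrix
two_rows <- apply_to_matrix(1:3, function(i) c(i, i^2))  # 2 x 3 matrix
```

Unlike vapply(), this does no checking: cbind() recycles shorter pieces, and
the result is NULL when X has length zero, so it trades safety for a
consistent shape.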

Bill Dunlap
TIBCO Software
wdunlap tibco.com




Re: [R] sapply returning list instead of matrix

2014-01-31 Thread Jeff Newmiller
Pot, meet kettle. You claim to be able to read documentation, yet you don't 
reference knowledge gained or clarity lost from such activity in your question.

I think this is a case of inertia of history that we all have to live with at 
this point. If you thoroughly read the documentation for ?sapply you will 
encounter the vapply function, which will provide the reliability you want at 
the cost of some additional syntactic complexity.

Or not. I rarely use apply functions for arrays... if I can't vectorize my 
calculation, I preallocate my result array and use a for loop to fill it up. I 
don't have this problem with ddply.
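
The preallocate-and-loop pattern mentioned above might look like the following
sketch; the inputs and the use of range() are arbitrary choices for
illustration:

```r
n_rows <- 2L
inputs <- 1:3

# Allocate the full result up front, so its shape does not depend on what
# the per-element computation happens to return.
result <- matrix(NA_real_, nrow = n_rows, ncol = length(inputs))
for (j in seq_along(inputs)) {
  result[, j] <- range(inputs[j])  # fill one column per input
}
```

The result is always a 2 x 3 numeric matrix here. Note that a length-one value
would be silently recycled down the column, so this guards the shape of the
result rather than the length of each piece.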

BTW: Gmail is capable of sending plain text... but you might have to read some 
documentation to find out how.
---
Jeff Newmiller                       The     .       .      Go Live...
DCN:jdnew...@dcn.davis.ca.us         Basics: ##.#.   ##.#.  Live Go...
                                     Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries           O.O#.   #.O#.  with
/Software/Embedded Controllers)              .OO#.   .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.
