Re: [R] sapply function and poisson distribution
On 05 Jan 2015, at 00:21, Pete Brecknock peter.breckn...@bp.com wrote:

> n <- c(1,2,3,4,5)
> lambda <- c(0.1,0.8,1.2,2.2,4.2)
> mapply(function(x,y) rpois(x,y), n, lambda)

Yes. I'd throw in a SIMPLIFY=FALSE to avoid getting results in a different format if n is constant (then again, sapply() in the original question is sort of asking for that kind of trouble...).

An alternative is to use the fact that rpois() vectorizes on the lambda argument:

    ll <- rep(lambda, n)
    g <- rep(seq_along(lambda), n)
    N <- sum(n)
    split(rpois(N, ll), g)

which can of course equally well be wrapped in a function as Pete's solution can.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
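For readers who want the vectorized alternative as a reusable function, here is a minimal sketch. The name rpois_by_group is made up for illustration. Wrapping g in a factor with explicit levels is an addition that is not in the message above: it is there so that an entry with n equal to 0 still yields an (empty) element instead of being dropped by split().

```r
# Hypothetical wrapper around the rep()/split() idea; the name is illustrative.
rpois_by_group <- function(n, lambda) {
  stopifnot(length(n) == length(lambda))
  ll <- rep(lambda, n)                      # one lambda per requested draw
  # Explicit factor levels keep groups with n == 0 (assumption: that is wanted).
  g <- factor(rep(seq_along(lambda), n), levels = seq_along(lambda))
  unname(split(rpois(sum(n), ll), g))       # a single rpois() call, then regroup
}

set.seed(42)
res <- rpois_by_group(c(1, 2, 3), c(0.1, 0.8, 1.2))
lengths(res)
```

The result is always a list, whatever the values in n, which sidesteps the format change that SIMPLIFY=FALSE guards against in the mapply() version.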
Re: [R] sapply function and poisson distribution
Thank you for your answer. Yes, that sounds right. I thought the same thing, but the problem is: how can I generalize the command so that it works for any pair of vectors, not only for the specific example c(1,2), c(0.1,0.8)?

2015-01-04 0:45 GMT+00:00 Pete Brecknock [via R] ml-node+s789695n4701358...@n4.nabble.com:

> How about using mapply, the multivariate version of sapply?
>
> Based on your example ...
>
> mapply(function(x,y) rpois(x,y), c(1,2),c(0.1,0.8))
>
> HTH
>
> Pete
Re: [R] sapply function and poisson distribution
dimnik wrote

> Thank you for your answer. Yes, that sounds right. I thought the same thing, but the problem is: how can I generalize the command for every vector of numbers, not only for the specific example c(1,2), c(0.1,0.8)?

Not sure how you intend to specify the input vectors for n and lambda. One way would be as below; you can amend the 2 vectors with the values of your choice.

    n <- c(1,2,3,4,5)
    lambda <- c(0.1,0.8,1.2,2.2,4.2)
    mapply(function(x,y) rpois(x,y), n, lambda)

HTH

Pete
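One way to generalize the mapply() call to arbitrary inputs is to wrap it in a function. This is only a sketch: the name rpois_list and the length check are illustrative additions, and SIMPLIFY = FALSE is included so the result stays a list even when every count is the same.

```r
# Hypothetical wrapper; works for any two equal-length vectors.
rpois_list <- function(n, lambda) {
  stopifnot(length(n) == length(lambda))
  # rpois(n[i], lambda[i]) for each i; SIMPLIFY = FALSE always returns a list.
  mapply(rpois, n, lambda, SIMPLIFY = FALSE)
}

set.seed(1)
res <- rpois_list(c(1, 2, 3), c(0.1, 0.8, 1.2))

# With equal counts, the default simplification yields a matrix instead:
simplified <- mapply(rpois, c(2, 2), c(0.1, 0.8))
is.matrix(simplified)                       # TRUE
is.list(rpois_list(c(2, 2), c(0.1, 0.8)))   # TRUE
```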
Re: [R] sapply function and poisson distribution
dimnik wrote

> I want to find a function that takes in two vectors of numbers that have the same length. The output should be a list of vectors, where each vector is a sequence of randomly generated Poisson variables, where the number of samples in each vector is determined by the entries in the first input vector and the lambdas come from the entries in the second input vector. For example, if the inputs are c(1,2) and c(0.1,0.8), the output will be a list of two vectors, where the first vector has a single sample from Poisson(0.1) and the second vector has two samples from Poisson(0.8). How can I do all that kind of stuff using the sapply function? Thank you in advance.

How about using mapply, the multivariate version of sapply?

Based on your example ...

    mapply(function(x,y) rpois(x,y), c(1,2),c(0.1,0.8))

HTH

Pete
Re: [R] sapply returning list instead of matrix
> I can read the documentation, I see why it happens, but who in their right mind would design a function this way?

I think you're possibly starting from the wrong perspective, or at least it might be useful to look at it from a different perspective. In many cases, such as simulations, lapply returns a list of identical-length vectors that, for subsequent purposes, would be more convenient if simplified to a vector or matrix, and that's an extra step or two. sapply is the answer to "wouldn't it be nice if lapply simplified things for me if it were possible?"

Now, if your function does something unexpected and returns uneven lengths, that's actually easier to catch if the return type changes. Consider: a function expected to return a length 5 vector could return a length one NA for some input, probably with a warning; that would cause the current sapply to return a list, and subsequent statements expecting a matrix or vector would grind to a halt. This makes it quite hard for bugs to go undetected. Forcing sapply to pad to the same length to guarantee an array would hide that; your script would continue to run and you'd be none the wiser until much later. Bugs could _more_ easily get into production code.

And of course, it is pretty much trivial to test for the correct type on return, using is.list etc, so it's a readily trappable behaviour as long as you plan for it.

S
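The "length 5 vector that sometimes comes back as a single NA" scenario can be sketched as follows. The function five_stats and the input data are made up for illustration; the point is only that the fallback to a list is easy to detect with is.list().

```r
# Hypothetical worker: normally returns 5 values, but a single NA for empty input.
five_stats <- function(x) {
  if (length(x) == 0) return(NA_real_)
  as.numeric(fivenum(x))
}

inputs <- list(a = c(1, 2, 3), b = numeric(0), c = c(4, 5, 6))
res <- sapply(inputs, five_stats)

# Uneven lengths: sapply() falls back to a list, which is easy to trap.
bad <- character(0)
if (is.list(res)) {
  bad <- names(res)[lengths(res) != 5]
  message("unexpected result length for: ", paste(bad, collapse = ", "))
}
```

When every input is well behaved, the same call returns a 5-row matrix, so a check such as stopifnot(is.matrix(res)) right after the sapply() call would serve the same purpose.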
Re: [R] sapply returning list instead of matrix
Can I follow up with what I've learned about my own myopia regarding sapply()? First, I appreciate all the feedback.

After thinking about it for a while I realized R designers have often chosen to accommodate interactive usage, and in that context, sapply() returning different types makes perfect sense. If applying both 'mean' and 'var' to multiple data sets in a list, it makes sense to return a matrix, but if applying just 'mean' to the same list of data sets it makes sense to return a vector, not a 1xN matrix.

This works well in an interactive context, but when writing robust applications it is essential that routines return consistent types, especially if the parameters are determined from unpredictable user input. The behavior of functions like sapply() in R seems extraordinary compared to languages I am more familiar with like C, Java, or Python.

In my case I was using sapply() to extract alignments from multiple BAM files that overlap exons of a gene. My application of sapply() returned a matrix with data sets across columns and exons down the rows. This worked well for most genes, but failed when run on a gene with only a single exon because sapply() returned a list instead of a matrix. This bug in my code was just waiting for the right set of inputs to trigger it.

[ Some suggested using vapply() but I don't think that would help in this case because the length of the return value from the applied function is variable and depends on how many exons are in the gene. Or perhaps I just don't understand vapply well. ]

sapply() is behaving very similarly to the way the '[' operator treats data frames. The extract operator '[' returns a vector when extracting a single column from a data frame, otherwise it returns a data frame. However, '[' takes a 'drop' parameter to control this behavior so you can get a consistent type back if you need it. I wish sapply() had a similar option.

-csw
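On the bracketed vapply() question: FUN.VALUE can be built at run time, so a result length that varies from gene to gene is not an obstacle as long as it is known before the call. A rough sketch follows, with made-up names (count_per_exon, exon_matrix) and toy numbers standing in for the BAM data. Note that vapply() itself still returns a plain vector when the length is 1, so the sketch restores the dimensions explicitly.

```r
# Hypothetical stand-in for the per-BAM-file computation: one value per exon.
count_per_exon <- function(dataset, n_exons) {
  as.numeric(seq_len(n_exons)) * dataset
}

exon_matrix <- function(datasets, n_exons) {
  res <- vapply(datasets, count_per_exon, FUN.VALUE = numeric(n_exons),
                n_exons = n_exons)
  # vapply() returns a plain vector when n_exons is 1, so rebuild the matrix.
  matrix(res, nrow = n_exons, ncol = length(datasets))
}

dim(exon_matrix(c(10, 20, 30), n_exons = 4))   # 4 3
dim(exon_matrix(c(10, 20, 30), n_exons = 1))   # 1 3
```

If any call returned a different number of values than n_exons, vapply() would stop with an error instead of silently changing the return type.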
Re: [R] sapply returning list instead of matrix
As you ignored the posting guide and posted in HTML, your "below" didn't get through. So one can only guess that it has something to do with (see ?sapply):

"Simplification in sapply is only attempted if X has length greater than zero and if the return values from all elements of X are all of the same (positive) length. If the common length is one the result is a vector, and if greater than one is a matrix with a column corresponding to each element of X."

Return values must also be of the same type, obviously.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
H. Gilbert Welch

On Fri, Jan 31, 2014 at 1:36 PM, chris warth cswa...@gmail.com wrote:

> Can anyone suggest a rationale for why sapply() returns different types (list and matrix) in the two examples below? Is there any way to get sapply() or any other apply() function to return a matrix in both cases? simplify=TRUE doesn't change the outcome. I understand why it is happening, I just can't understand why such unpredictable behavior makes sense.
Re: [R] sapply returning list instead of matrix
Hey thanks for the helpful snark, Bert.

To everyone else, I apologize for neglecting to actually include the examples.

    a <- function(i) { list(1) }
    b <- function(i) { list(1,2) }
    ll <- sapply(seq(3), a, simplify=TRUE)
    mm <- sapply(seq(3), b)
    class(ll)
    [1] "list"
    class(mm)
    [1] "matrix"

I can read the documentation, I see why it happens, but who in their right mind would design a function this way? Can you imagine how many bugs are lurking because people haven't yet hit the right set of input that is going to cause sapply() to return a list instead of a matrix?

The point is that having the type of return value depend on the length of output from the applied function is simply madness. It is a terrible design decision. What is to be gained from the fact that I have to test the type of value returned from sapply()?

I was hoping plyr::laply() would be better but it perpetuates the same bad interface.

[So sorry for sending HTML, if that is what's happening. I guess gmail sends HTML by default?]
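As one possible answer to "is there any way to get a matrix in both cases": binding the lapply() results column-wise gives a matrix regardless of the common length. This is a sketch, not something proposed in the thread, and bind_results is a made-up name.

```r
a <- function(i) list(1)
b <- function(i) list(1, 2)

# Hypothetical helper: always bind the per-element results as columns.
bind_results <- function(X, FUN) do.call(cbind, lapply(X, FUN))

ll <- bind_results(seq(3), a)
mm <- bind_results(seq(3), b)
dim(ll)   # 1 3
dim(mm)   # 2 3
```

Two caveats: an empty X gives NULL, and results of unequal length are recycled by cbind() (with a warning only when the lengths are not multiples of each other), so this trades sapply()'s type change for a different failure mode.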
Re: [R] sapply returning list instead of matrix
> I can read the documentation, I see why it happens, but who in their right mind would design a function this way? Can you imagine how many bugs are lurking because people haven't yet hit the right set of input that is going to cause sapply() to return a list instead of a matrix().

If you always want a list output use lapply(). If you want the simplification that sapply does, but with sanity checks, use vapply(). vapply() lets you assert the type and size of FUN's return value. If all goes well it returns what sapply() would return but it throws an error if any call to FUN returns something unexpected. (Also, if length(X) is 0, vapply makes the output be a zero-length object of the appropriate type.)

    vapply(1:3, FUN=seq_along, FUN.VALUE=1L)
    [1] 1 1 1

    vapply(1:3, FUN=range, FUN.VALUE=c(0,0))
         [,1] [,2] [,3]
    [1,]    1    2    3
    [2,]    1    2    3

    vapply(1:3, FUN=seq, FUN.VALUE=1L)
    Error in vapply(1:3, FUN = seq, FUN.VALUE = 1L) :
      values must be length 1,
     but FUN(X[[2]]) result is length 2

    vapply(numeric(0), FUN=range, FUN.VALUE=c(0,0)) # returns 2 by 0 numeric matrix

    [1,]
    [2,]

Bill Dunlap
TIBCO Software
wdunlap tibco.com
Re: [R] sapply returning list instead of matrix
Pot, meet kettle. You claim to be able to read documentation, yet you don't reference knowledge gained or clarity lost from such activity in your question.

I think this is a case of inertia of history that we all have to live with at this point. If you thoroughly read the documentation for ?sapply you will encounter the vapply function, which will provide the reliability you want at the cost of some additional syntactic complexity. Or not. I rarely use apply functions for arrays... if I can't vectorize my calculation, I preallocate my result array and use a for loop to fill it up. I don't have this problem with ddply.

BTW: Gmail is capable of sending plain text... but you might have to read some documentation to find out how.

---------------------------------------------------------------------------
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.

On January 31, 2014 2:22:00 PM PST, chris warth cswa...@gmail.com wrote:

> [...] I can read the documentation, I see why it happens, but who in their right mind would design a function this way? [...]
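The "preallocate and fill with a for loop" pattern can be sketched like this. The function summarise_item and the toy data are hypothetical; the relevant property is that the shape of the result is fixed before any value is computed, so it stays a matrix even with a single row.

```r
# Hypothetical per-item computation returning n_out values.
summarise_item <- function(x, n_out) rep(mean(x), n_out)

items <- list(c(1, 2, 3), c(4, 5), c(6, 7, 8, 9))
n_out <- 1

# Preallocate: the result is a matrix from the start, whatever n_out is.
result <- matrix(NA_real_, nrow = n_out, ncol = length(items))
for (i in seq_along(items)) {
  result[, i] <- summarise_item(items[[i]], n_out)
}
dim(result)   # 1 3
```

A result of the wrong length triggers an error at the assignment, so a misbehaving function is caught at the iteration where it happens.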
Re: [R] SAPPLY function for COLUMN NULL
    colnames(dd)
    #[1] "col1" "colb"
    null_vector <- colnames(dd)
    sapply(null_vector, makeNull, dd)
    #     col1 colb
    #[1,]   NA    4
    #[2,]    2   NA
    #[3,]    3    2
    #[4,]    4   NA
    #[5,]    1    4
    #[6,]   NA    5
    #[7,]    1    6

A.K.

> I am trying to make a column value in a dataframe = NA if there is a 0 or high value in that column. I need to do this process repeatedly, hence I have to define a function. Here is the code that I am using and that is not working. Please advise on where I am making an error.
>
>     makeNull <- function(col, data=dd) {
>       is.na(data[[col]]) <- data[[col]] == 0
>       is.na(data[[col]]) <- data[[col]] > 99
>       return(data[[col]])
>     }
>     dd <- data.frame(col1=c(0,2,3,4,1,0,1),colb=c(4,0,2,0,4,5,6))
>     null_vector=c("cola","colb")
>     sapply(null_vector,function(x) makeNull(x,dd))
>
>     Error in `[[<-.data.frame`(`*tmp*`, col, value = logical(0)) :
>       replacement has 0 rows, data has 7
>
> Thank you in advance.
>
> -Sanjeev
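The error in the question appears to come from the name "cola": dd has columns col1 and colb, so data[["cola"]] is NULL and the comparison yields logical(0). A small sketch of a guard against that, plus a loop-free alternative, follows. The threshold 99 is taken from the question; the variable names wanted, missing_cols and cleaned are illustrative.

```r
dd <- data.frame(col1 = c(0, 2, 3, 4, 1, 0, 1),
                 colb = c(4, 0, 2, 0, 4, 5, 6))

# Check the requested names against the data before indexing with [[ ]].
wanted <- c("cola", "colb")
missing_cols <- setdiff(wanted, colnames(dd))   # "cola"

# Vectorised alternative: set zeros and values above 99 to NA in one step.
cleaned <- dd
cleaned[cleaned == 0 | cleaned > 99] <- NA
```

This replaces values across the whole data frame at once; to restrict it to some columns, apply the same assignment to a subset such as dd[wanted_columns].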
Re: [R] sapply error produced by grid search
On 16-05-2013, at 17:31, Patel, Shreena s.pate...@lancaster.ac.uk wrote:

> Dear R User,
>
> I'm trying to perform a grid-search for the ML estimator of the Box-Cox parameter for a linear mixed model. However using sapply to perform the grid search returns an error message. Here's a small example to demonstrate:
>
>     library(lme4)
>
>     # Function to fit model for a given lambda:
>     bc.fit <- function(lam,X,Z,Y){
>       ybar <- exp(mean(log(Y)))
>       if(lam==0){
>         w <- ybar*log(Y)
>       } else {
>         w <- (Y^lam-1)/(lam*ybar^(lam-1))
>       }
>       bc.mod <- lmer(w~X+(1|Z))
>       as.numeric(logLik(bc.mod))
>     }
>
>     # Simulate data
>     x <- runif(1000)
>     z <- sample(1:100,1000,T)
>     b <- rnorm(100)[z]
>     y <- rnorm(1000,20+0.5*x+b,2)
>
>     # Perform search
>     lambda <- 1:10/10
>     sapply(lambda,bc.fit,X=x,Z=z,Y=y)
>
> Produces the error:
>
>     Error in get(as.character(FUN), mode = "function", envir = envir) :
>       object 'lambda' of mode 'function' was not found
>
> However a single run works fine: bc.fit(lambda[1],x,z,y)
>
> Other people appear to have had similar errors, caused by naming variables after existing functions, however I don't think that's the problem here. Any advice would be appreciated, thank you!

It is the problem. The first argument of sapply is X. Your function bc.fit has an argument X which you explicitly set to x. So sapply takes lambda as the second argument which is FUN (function).

So rename your bc.fit arguments to something else: e.g. (x,y,z) or (X1,Y1,Z1). And reread the documentation of sapply. It's all there.

Berend
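The argument clash can be reproduced without lme4. In this sketch, score is a made-up stand-in for bc.fit whose second argument is also called X. Besides renaming the arguments as suggested above, wrapping the call in an anonymous function is another way around it, since then no argument named X is passed through sapply().

```r
# Hypothetical stand-in for bc.fit(): its second argument is also called X.
score <- function(lam, X) sum(X^lam)

x <- c(1, 2, 3)
lambda <- c(1, 2)

# X = x is matched to sapply()'s own X, so lambda is taken as FUN and this fails.
clash <- try(sapply(lambda, score, X = x), silent = TRUE)
inherits(clash, "try-error")   # TRUE

# Workaround without renaming: make the call inside an anonymous function.
ok <- sapply(lambda, function(lam) score(lam, X = x))
ok   # 6 14
```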
Re: [R] sapply and matrix command
On Wed, Aug 8, 2012 at 10:37 AM, alijk1989 [via R] ml-node+s789695n4639622...@n4.nabble.com wrote:

> Hi Michael, Thanks for your response. Here is a simple example of what I am trying to do: [...]

Hi Alijk,

The problem ultimately comes from the fact that equicorr() is not nicely vectorized, so you'll need to home-brew something to replace it, but that's not hard. However, I don't think that pmnorm is vectorized in the varcov argument: we can probably work out the double sapply() loop if you can assume rho is constant, otherwise I think you're stuck.

Michael
Re: [R] sapply and matrix command
Hi,

I have made some progress speeding up my code. This is what I have at the moment:

    M1 = sum(sapply(1:m, function(k) {
      sum(sapply(1:m, function(j) {
        w[k]*w[j]*LGD^2*(pmnorm(c(qnorm(Q[k]),qnorm(Q[j])),c(0,0),equicorr(2,rho[k,j]))-Q[k]*Q[j])
      }))
    }))

I tried setting up a function like so:

    f1 <- function(k,j) {
      w[k]*w[j]*LGD^2*(pmnorm(c(qnorm(Q[k]),qnorm(Q[j])),c(0,0),equicorr(2,rho[k,j]))-Q[k]*Q[j])
    }

Then run outer and sum like so:

    sum(outer(1:m,1:m,f1))

Unfortunately outer doesn't seem to like the equicorr or matrix commands, as in my problem above. Is there any way around this or am I stuck with the double sapply?

Thanks again for your help!
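outer() calls its function once with two long vectors of indices, which is why a function that builds a 2 x 2 matrix from a single rho[k, j] fails there. A common workaround is Vectorize(), sketched below. To keep the example self-contained, the pmnorm()/equicorr() term is replaced by a made-up stand-in (the determinant of a hand-built 2 x 2 correlation matrix); only the calling pattern is meant to carry over.

```r
m   <- 4
w   <- rep(0.02, m)
rho <- matrix(0.5, nrow = m, ncol = m)

# Stand-in for the real term: scalar k and j only, because it builds a matrix.
f1 <- function(k, j) {
  corr <- matrix(c(1, rho[k, j], rho[k, j], 1), nrow = 2)
  w[k] * w[j] * det(corr)
}

# Vectorize() turns f1 into a function that loops over the (k, j) pairs.
total_outer  <- sum(outer(1:m, 1:m, Vectorize(f1)))
total_sapply <- sum(sapply(1:m, function(k) sum(sapply(1:m, function(j) f1(k, j)))))
all.equal(total_outer, total_sapply)   # TRUE
```

Vectorize() is a wrapper around mapply(), so this tidies the code but should not be expected to run faster than the double sapply().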
Re: [R] sapply and matrix command
On Wed, Aug 8, 2012 at 9:17 AM, alijk1989 [via R] ml-node+s789695n4639595...@n4.nabble.com wrote:

> [...] Unfortunately outer doesn't seem to like the equicorr or matrix commands like my problem above. Is there any way around this or am I stuck with the double sapply? Thanks again for your help!

I'm afraid we have no context for what you're trying to do -- the vast majority of us don't use Nabble. Could you enlighten us?

Also, please do try to make your code more legible by breaking things up into multiple lines.

Finally, try to make a reproducible example according to the hints herein:
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Cheers,
Michael
Re: [R] sapply and matrix command
Hi Michael,

Thanks for your response. Here is a simple example of what I am trying to do:

    w=rep(0.02,10)
    Q=rep(0.02,10)
    rho=matrix(0.5,nrow=10,ncol=10)
    m=10
    LGD=0.45
    M1=sum(sapply(1:m, function(k) {
      sum(sapply(1:m, function(j) {
        w[k]*w[j]*LGD^2*(pmnorm(c(qnorm(Q[k]),qnorm(Q[j])),c(0,0),equicorr(2,rho[k,j]))-Q[k]*Q[j])
      }))
    }))

It uses the mnormt and QRM packages. I am trying to reproduce the following sum:

    \sum\limits_{j=1}^M \sum\limits_{k=1}^M w_j w_k LGD^2[N_2(N^{-1}[Q(j)],N^{-1}[Q(k)],\rho_{jk})-Q(j)Q(k)]
Re: [R] sapply() and by()
Dominic,

It's great that you provided some example data, but a much smaller data frame would have sufficed. For example, 10 randomly selected rows from your data ...

    LF <- structure(list(
      Serra.da.Foladoira = c(27.335652173913, 25.4632608695652, 24.464652173913, 22.550652173913, 22.2177826086956, 29.3744782608695, 24.1317826086956, 25.5464782608695, 27.7517391304348, 25.172),
      Santiago = c(32.6199565217391, 27.9597826086956, 32.7863913043478, 25.2136086956521, 23.7573043478261, 32.6199565217391, 28.6671304347826, 27.9597826086956, 29.7489565217391, 23.5492608695652),
      Sergude = c(31.7877826086956, 27.4604782608695, 26.1706086956521, 25.8377391304348, 26.5034782608695, 33.2856956521739, 30.4979130434782, 30.7059565217391, 30.8307826086956, 31.9542173913043),
      Rio.Do.Sol = c(30.3730869565217, 25.7545217391304, 25.421652173913, 24.1317826086956, 23.4660434782608, 31.1220434782608, 25.8377391304348, 25.8793478260869, 30.7059565217391, 24.464652173913),
      V5 = c(10L, 2L, 2L, 11L, 3L, 8L, 8L, 3L, 8L, 6L)),
      .Names = c("Serra.da.Foladoira", "Santiago", "Sergude", "Rio.Do.Sol", "V5"),
      row.names = c(1017L, 778L, 400L, 1403L, 86L, 1311L, 598L, 1536L, 605L, 520L),
      class = "data.frame")

Try this code to calculate the mean of each of the first four columns for each value of the fifth column ...

    aggregate(LF[, 1:4], list(month=LF$V5), mean)

The sapply() approach doesn't have a built-in "by" type of argument.

Jean

Dominic Roye dominic.r...@gmail.com wrote on 08/06/2012 09:34:58 AM:

> Hello everyone, I have a dataset with 5 columns (4 columns with thresholds of weather stations and one with month - data of 5 years). Now I would like to calculate the average for each month. I tried this unsuccessfully:
>
>     lf.med <- sapply(LF[,1:4],mean,LF[,5])
>     Error in mean.default(X[[1L]], ...) : 'trim' must be numeric and have length 1
>
> With
>
>     lf.med <- by(LF[,1:4],LF[,5],mean)
>
> it works, but it's deprecated. Any help is greatly appreciated!!! Thank you everybody!!
>
> Dominic
>
>     dput(LC)
>     structure(list(Serra.da.Foladoira = c(21.1359565217391, 21.7184782608695, 23.5492608695652, 23.4660434782608, 23.6740869565217, 21.1775652173913, SNIPPED 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L)), row.names = c(NA, -1826L), .Names = c("Serra.da.Foladoira", "Santiago", "Sergude", "Rio.Do.Sol", "V5"), class = "data.frame")
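To make the aggregate() suggestion above concrete, here is a minimal sketch on a tiny made-up data frame (the column names `a` and `b` and all values are assumptions for illustration, not the weather data from the post). It also shows that a grouped sapply() is possible if the data are split by month first, which is roughly what aggregate() does for you.

```r
# Small illustrative data: two measurement columns and a month column
LF <- data.frame(
  a  = c(1, 2, 3, 4, 5, 6),
  b  = c(10, 20, 30, 40, 50, 60),
  V5 = c(1L, 1L, 2L, 2L, 3L, 3L)
)

# Mean of each measurement column within each month
agg <- aggregate(LF[, 1:2], list(month = LF$V5), mean)

# The same numbers via split() + sapply(): one row per month after t()
via_split <- t(sapply(split(LF[, 1:2], LF$V5), colMeans))
```

aggregate() returns a data frame with the grouping column included, while the split()/sapply() route returns a matrix with months as row names; which is more convenient depends on what happens next.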
Re: [R] sapply() and by()
On Aug 6, 2012, at 7:34 AM, Dominic Roye wrote:

> Hello everyone, I have a dataset with 5 columns (4 columns with thresholds of weather stations and one with month - data of 5 years). Now I would like to calculate the average for each month. I tried this unsuccessfully:
>
>     lf.med <- sapply(LF[,1:4],mean,LF[,5])

If you want to group calculations within categories then sapply is not the right function to turn to immediately. Use one of 'aggregate', 'tapply' or 'ave'.

>     Error in mean.default(X[[1L]], ...) : 'trim' must be numeric and have length 1

It is telling you that the unnamed third argument was matched to the 'trim' parameter of the function 'mean'. Perhaps:

    aggregate( LF[,1:4], list(LF[,5]), mean)

> With lf.med <- by(LF[,1:4],LF[,5],mean) it works, but it's deprecated.

Actually what is deprecated is the function `mean.data.frame`.

> Any help is greatly appreciated!!! Thank you everybody!!

Minimal example. PLEASE.

> Dominic
>
>     dput(LC)

Please do note that you offered an object 'LC' but your code referred to 'LF'.

>     structure(list(Serra.da.Foladoira = c(21.1359565217391, 21.7184782608695, 23.5492608695652, 23.4660434782608, 23.6740869565217, 21.1775652173913, 19.8460869565217, 23.3412173913043, 22.8835217391304, 24.3398260869565, snipped 1800+ length vector

-- David Winsemius, MD Alameda, CA, USA
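Since the reply above names 'tapply' and 'ave' and points out that the deprecated piece is mean() on a data frame rather than by() itself, a small sketch may help. It uses a made-up data frame (column names and values are assumptions) and passes colMeans to by() so that mean() is never called on a data frame.

```r
LF <- data.frame(
  a  = c(1, 2, 3, 4, 5, 6),
  b  = c(10, 20, 30, 40, 50, 60),
  V5 = c(1L, 1L, 2L, 2L, 3L, 3L)
)

# by() hands each group's sub-data-frame to FUN; colMeans accepts a data frame
by_means <- do.call(rbind, by(LF[, 1:2], LF$V5, colMeans))

# tapply() works on one vector at a time: one mean per month
tap <- tapply(LF$a, LF$V5, mean)

# ave() returns a vector as long as the input, with each value
# replaced by the mean of its group
grp_mean <- ave(LF$a, LF$V5)
```

ave() is the one to reach for when the group mean is needed as a new column next to the original rows; tapply() and by() give one result per group.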
Re: [R] sapply and matrix command
Thanks again for the help; it looks like this will be useful for what I'm doing. Is there any way to use combn to return combinations of values with themselves, e.g.

    combn(1:3,2)
         [,1] [,2] [,3] [,4] [,5] [,6]
    [1,]    1    1    1    2    2    3
    [2,]    1    2    3    2    3    3
Re: [R] sapply and matrix command
Take a look at ?expand.grid

Michael
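As a sketch of how the expand.grid() hint could produce the "combinations with themselves" asked about in this thread: build every ordered pair, then keep only the pairs whose first element does not exceed the second. The column names `a` and `b` are arbitrary choices for illustration.

```r
g <- expand.grid(a = 1:3, b = 1:3)   # all 9 ordered pairs

# Keep each unordered pair once, including pairs of a value with itself
keep <- g[g$a <= g$b, ]
keep <- keep[order(keep$a, keep$b), ]

# Transpose so the layout matches combn(): one pair per column
with_rep <- t(as.matrix(keep))
```

For 1:3 this gives the six pairs (1,1), (1,2), (1,3), (2,2), (2,3), (3,3), one per column.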
Re: [R] sapply and matrix command
Hi,

You can also try this:

    d <- 1:25
    A <- sample(combn(20:30,2))
    B <- sample(combn(20:30,2))
    lapply(d, function(x) matrix(c(1,A[x],B[x],1),2,2))
    [[1]]
         [,1] [,2]
    [1,]    1   23
    [2,]   27    1

    [[2]]
         [,1] [,2]
    [1,]    1   21
    [2,]   21    1

    [[3]]
         [,1] [,2]
    [1,]    1   29
    [2,]   23    1

    [[4]]
         [,1] [,2]
    [1,]    1   25
    [2,]   20    1

    [[5]]
         [,1] [,2]
    [1,]    1   23
    [2,]   28    1

    ---

    [[24]]
         [,1] [,2]
    [1,]    1   24
    [2,]   26    1

    [[25]]
         [,1] [,2]
    [1,]    1   22
    [2,]   22    1

A.K.

This email was sent by arun kirshna (via Nabble)
Re: [R] sapply question
Go back and reread the section about the scoping of variables and that functions do not have side effects; they only return values. You are changing a local copy of df1 within the function, which is returning the changed values to df3.

On Wed, Jul 11, 2012 at 7:36 PM, Charles Stangor cstan...@charlesstangor.com wrote:

> Why does this sapply code change df3 but not df1? Thanks
>
>     df1 <- read.table(text="
>     cola colb colc cold cole
>     1 NA 5 9 NA 17
>     2 NA 6 NA 14 NA
>     3 3 NA 11 15 19
>     4 4 8 12 NA 20
>     ", header=TRUE)
>     df2 <- df1*2
>     df1
>     df2
>     df3 <- sapply(names(df1), function(x) {df1[[x]] <- df2[[x]]})
>     df1
>     df3

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
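The scoping point in this reply can be checked directly. The sketch below uses a two-column version of the data (an assumption made for brevity) and shows that the assignment inside the anonymous function touches only a local copy, while assigning the result of lapply() back into the data frame at top level does change it. `df1_new` is a name introduced here for illustration.

```r
df1 <- data.frame(cola = c(NA, NA, 3, 4), colb = c(5, 6, NA, 8))
df2 <- df1 * 2

# The assignment inside the function modifies a local copy of df1;
# sapply() only collects the value each call returns.
df3 <- sapply(names(df1), function(x) { df1[[x]] <- df2[[x]] })

# df1 itself is untouched at this point. To really replace its columns,
# assign at top level, outside the function:
df1_new <- df1
df1_new[] <- lapply(names(df1), function(x) df2[[x]])
```

The `df1_new[] <- ...` form keeps the object a data frame with its original column names while replacing every column.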
Re: [R] sapply help
On Friday, 3 February 2012 at 18:51, William Dunlap wrote:

> Instead of colSums(t(aMatrix)), why not the more direct rowSums(aMatrix)?

Because I felt it was more didactic. The question was about counting occurrences per column, so using rowSums() could be a little confusing without an explanation. Of course, your solution is faster and cleaner.
Re: [R] sapply help
On 3-02-2012, 08:37 (-0800), Filoche wrote:

> Hi everyone. I'm learning how to use sapply (and other functions of this family). Here's what I'm trying to do. I have a vector of, let's say, 5 elements. I also have a matrix of n x 5. I would like to know how many elements in each column are less than the corresponding element of my vector. In this example:
>
>     v = c(1:5)
>     M = matrix(3,2,5)
>
> I would like to have a vector at the end which gives me 0 0 0 2 2

This does that:

    sapply(1:5, function(i) sum(M[,i] < v[i]))
    [1] 0 0 0 2 2

Basically, it's like a loop where at each iteration the function is called with one element of the vector 1:5 as argument, so what this really does is

    sum(M[,1] < v[1])
    sum(M[,2] < v[2])
    ...

and then the results are put all together in a vector.

-- Cheers, Ernest
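The "it's like a loop" description above can be written out side by side. This is only a sketch of the equivalence, using the v and M from the question; the names `counts_sapply` and `counts_loop` are introduced here for illustration.

```r
v <- c(1:5)
M <- matrix(3, 2, 5)

# sapply(): call the function once per index and collect the results
counts_sapply <- sapply(1:5, function(i) sum(M[, i] < v[i]))

# The explicit loop that the sapply() call stands in for
counts_loop <- integer(length(v))
for (i in seq_along(v)) {
  counts_loop[i] <- sum(M[, i] < v[i])
}
```

Both give 0 0 0 2 2, since every entry of M is 3 and only v[4] and v[5] exceed it.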
Re: [R] sapply help
On Friday, 3 February 2012 at 18:27 +0100, Ernest Adrogué wrote:

> This does that:
>
>     sapply(1:5, function(i) sum(M[,i] < v[i]))
>     [1] 0 0 0 2 2

Though in your case, I think there are shorter solutions. For example:

    colSums(t(apply(M, 1, "<", v)))
    [1] 0 0 0 2 2

apply() is more suited to matrices. Here, it takes each row separately, and compares it with v. Then, you can just sum the result to count the number of cases that fulfill the condition.

Cheers
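It may help to look at the intermediate result of the apply() call described above, because its shape explains the t(). A small sketch with the same v and M; `cmp` is a name introduced here.

```r
v <- c(1:5)
M <- matrix(3, 2, 5)

# Each row of M (length 5) is compared with v; apply() stores the result
# of each call as a COLUMN, so cmp has 5 rows and 2 columns.
cmp <- apply(M, 1, "<", v)

# Transposing restores the 2 x 5 layout of M, so column sums count per column of M
counts <- colSums(t(cmp))
```

Because apply() puts each row's result into a column, summing the rows of `cmp` directly would give the same counts without the transpose.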
Re: [R] sapply help
Instead of colSums(t(aMatrix)), why not the more direct rowSums(aMatrix)?

If time is an issue (which it won't be unless the number of columns of M is big), compare:

    M <- matrix(2e5:1, nrow=2)
    v <- 1:ncol(M)
    system.time(z1 <- sapply(seq_along(v), function(i) sum(M[,i] < v[i])))
       user  system elapsed
      0.532   0.000   0.532
    system.time(z2 <- colSums(t(apply(M, 1, "<", v))))
       user  system elapsed
      0.004   0.000   0.006
    system.time(z3 <- rowSums(apply(M, 1, "<", v)))
       user  system elapsed
      0.008   0.000   0.005
    system.time(z4 <- colSums(M < matrix(v, nrow=nrow(M), ncol=ncol(M), byrow=TRUE)))
       user  system elapsed
      0.000   0.000   0.002
    isTRUE(all.equal(z1, z2)) && isTRUE(all.equal(z1,z3)) && isTRUE(all.equal(z1,z4))
    [1] TRUE

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
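For readers who want to check that the variants timed in this message agree without running a large benchmark, here is a small sketch on a 2 x 10 matrix. `z5` is an extra variant added here, not from the thread: it relies on R storing matrices column by column, so repeating each element of v nrow(M) times lines up with the columns of M.

```r
M <- matrix(20:1, nrow = 2)
v <- 1:ncol(M)

z1 <- sapply(seq_along(v), function(i) sum(M[, i] < v[i]))
z2 <- colSums(t(apply(M, 1, "<", v)))
z3 <- rowSums(apply(M, 1, "<", v))
z4 <- colSums(M < matrix(v, nrow = nrow(M), ncol = ncol(M), byrow = TRUE))

# Assumed extra variant: expand v to match M's column-major storage
z5 <- colSums(M < rep(v, each = nrow(M)))

all_agree <- isTRUE(all.equal(z1, z2)) && isTRUE(all.equal(z1, z3)) &&
  isTRUE(all.equal(z1, z4)) && isTRUE(all.equal(z1, z5))
```

z4 and z5 do one comparison over the whole matrix instead of one call per row or column, which is consistent with z4 being the fastest variant in the timings above.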
Re: [R] sapply help
Thank you, sir. You explained it very well. This gives me a good starting point for using sapply more frequently.

Cordially, Phil
Re: [R] sapply Call Returning the condition has length 1 Error
You are right that the problem is that DummyFunc isn't vectorized. R looks for a single logical value in an if statement, but x > 0 gives it a whole vector's worth -- as the warning indicates, it only uses the first element and pushes the whole vector through the return(-x) branch, which explains the values you saw. The correct way to do it would be something like:

    ifelse(x < 0, -x, x)

If, as you suggest, you can't modify the function (for whatever reason), you can use the higher-order function Vectorize() as follows:

    vDummyFunc <- Vectorize(DummyFunc)
    vDummyFunc(-3:7)

This isn't real vectorization, but it hides some *apply family stuff nicely. Note that this doesn't act as you might expect on Y, since data.frames are taken column-wise by default (you'll get the same problem).

Michael

On Tue, Dec 27, 2011 at 1:14 PM, Alex Zhang alex.zh...@ymail.com wrote:

> Dear all, Happy new year! I have a question re using sapply. Below is a dummy example that would replicate the error I saw.
>
>     ##Code starts here
>     DummyFunc <- function(x) {
>       if (x > 0) {
>         return (x)
>       } else {
>         return (-x)
>       }
>     }
>     Y = data.frame(val = c(-3:7))
>     sapply(Y, FUN = DummyFunc)
>     ##Code ends here
>
> When I run it, I got:
>
>           val
>      [1,]   3
>      [2,]   2
>      [3,]   1
>      [4,]   0
>      [5,]  -1
>      [6,]  -2
>      [7,]  -3
>      [8,]  -4
>      [9,]  -5
>     [10,]  -6
>     [11,]  -7
>     Warning message:
>     In if (x > 0) { : the condition has length > 1 and only the first element will be used
>
> The result is different from what I would expect, plus there is such an error message. I guess if the DummyFunc I provided were compatible with vectors, the problem would go away. But let's suppose I cannot change DummyFunc. Is there still a way to use sapply or alike without actually writing a loop? Thanks.
>
> - Alex
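A short sketch of the two suggestions in this reply, the Vectorize() wrapper and the ifelse() rewrite, applied to the column rather than to the whole data frame. One caveat worth stating: in R 4.2.0 and later an if() condition of length greater than one is an error rather than a warning, so the sketch never passes a vector to the unwrapped DummyFunc. The names `res_vectorize` and `res_ifelse` are introduced here.

```r
DummyFunc <- function(x) {
  if (x > 0) {
    return(x)
  } else {
    return(-x)
  }
}

Y <- data.frame(val = c(-3:7))

# Wrapper that calls DummyFunc once per element (mapply() under the hood)
vDummyFunc <- Vectorize(DummyFunc)
res_vectorize <- vDummyFunc(Y$val)

# Truly vectorized alternative, usable only if the logic can be rewritten
res_ifelse <- ifelse(Y$val < 0, -Y$val, Y$val)
```

Passing `Y$val` (a vector) rather than `Y` (a list of columns) is what makes the wrapper see one number at a time.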
Re: [R] sapply Call Returning the condition has length 1 Error
Dear Alex,

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Alex Zhang
> Sent: December-27-11 2:14 PM
> To: r-help@r-project.org
> Subject: [R] sapply Call Returning the condition has length > 1 Error
>
> Dear all, Happy new year! I have a question re using sapply. Below is a dummy example that would replicate the error I saw.
>
>     ##Code starts here
>     DummyFunc <- function(x) {
>       if (x > 0) {
>         return (x)
>       } else {
>         return (-x)
>       }
>     }
>     Y = data.frame(val = c(-3:7))
>     sapply(Y, FUN = DummyFunc)
>     ##Code ends here
>
> When I run it, I got:
>
>           val
>      [1,]   3
>      [2,]   2
>      [3,]   1
>      [4,]   0
>      [5,]  -1
>      [6,]  -2
>      [7,]  -3
>      [8,]  -4
>      [9,]  -5
>     [10,]  -6
>     [11,]  -7
>     Warning message:
>     In if (x > 0) { : the condition has length > 1 and only the first element will be used
>
> The result is different from what I would expect, plus there is such an error message.

This is a warning, not really an error message. A data frame is essentially a list of variables (columns), and sapply() applies its FUN argument to each list element, that is, each variable -- the one variable val in your case. That produces a warning because val > 0 is a vector of 11 elements, and the first comparison, -3 > 0, which is FALSE, controls the result.

> I guess if the DummyFunc I provided were compatible with vectors, the problem would go away. But let's suppose I cannot change DummyFunc. Is there still a way to use sapply or alike without actually writing a loop? Thanks.

Well, you could just use

    abs(Y$val)
    [1] 3 2 1 0 1 2 3 4 5 6 7

but I suppose that you didn't really want to write your own version of the absolute-value function as something more than an exercise. An alternative is

    with(Y, ifelse(val > 0, val, -val))
    [1] 3 2 1 0 1 2 3 4 5 6 7

I hope this helps,
John

John Fox
Senator William McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox

> - Alex
Re: [R] sapply Call Returning the condition has length 1 Error
John,

Thanks for the pointers. The DummyFunc is just a made-up example. The true function I need to use is more complicated and would be distracting to include.

Do you mean that sapply would take columns in the input data.frame and feed them into FUN as whole vectors? That explains the behavior. Is there an *apply function that will feed elements of the input data.frame into FUN instead of whole columns?

Thanks.
Re: [R] sapply Call Returning the condition has length 1 Error
Tell us what you want to do, not how you want to do it. What is the problem you are trying to solve? You can create your own function/code within the 'apply' to process one element of the vector at a time. What is the output that you expect? There is (almost) always a way of doing it, as long as we know what you want to do.

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
Re: [R] sapply Call Returning the condition has length 1 Error
Dear Alex,

> -----Original Message-----
> From: Alex Zhang [mailto:alex.zh...@ymail.com]
> Sent: December-27-11 3:34 PM
> To: John Fox
> Cc: r-help@r-project.org
> Subject: Re: [R] sapply Call Returning the condition has length > 1 Error
>
> John, Thanks for the pointers. The DummyFunc is just a made-up example. The true function I need to use is more complicated and would be distracting to include.

You'll probably get a better answer if you don't keep what you want to do a secret.

> Do you mean that sapply would take columns in the input data.frame and feed them into FUN as whole vectors? That explains the behavior.

Yes. As I said, a data frame is a list of columns, so FUN is called with each column as its argument.

> Is there an *apply function that will feed elements of the input data.frame into FUN instead of whole columns? Thanks.

I'm afraid that I don't know what you mean. Do you want to deal with the columns of the data frame separately (in general, they need not all be of the same class), and within each column, apply a function separately to each element? You could nest calls to lapply() or sapply(), as in

    sapply(D, function(DD) sapply(DD, abs))

assuming, of course, that D is an entirely numeric data frame. But in this case, abs(as.matrix(D)) would be more sensible, and using sapply() like this isn't necessarily better than a loop. Again, not knowing what you want to do makes it hard to suggest a solution.

Best,
John
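The nested sapply() idea from this reply can be tried with the non-vectorized DummyFunc from earlier in the thread in place of abs. The data frame `D` below is made up for illustration, and the sketch assumes every column is numeric, as the reply does.

```r
DummyFunc <- function(x) {
  if (x > 0) {
    return(x)
  } else {
    return(-x)
  }
}

# Illustrative, entirely numeric data frame
D <- data.frame(a = c(-1, 2, -3), b = c(4, -5, 6))

# Outer sapply(): one call per column. Inner sapply(): one call per element.
nested <- sapply(D, function(col) sapply(col, DummyFunc))
```

The result is a numeric matrix with one column per column of D. When the function is already vectorized, as abs() is, applying it to as.matrix(D) directly is simpler.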
Re: [R] sapply Call Returning the condition has length 1 Error
John,

Thank you for your comment. There is no secret, but the actual function I need to call is rather irrelevant. However, don't take it to be the abs function. If you would like to know, it is a function that converts 4 kinds of old ids from several old database tables into a new id in a new database. Again, I don't think providing such detail is better than saying that DummyFunc maps a number to a number but doesn't work with vectors.

All I need to do is call DummyFunc for every element in a column of a data.frame and return the resulting vector. But I cannot change DummyFunc. Correct me if I am wrong: this is rather common in a group collaboration environment. Person A may be responsible for writing a function, and person B, who needs to use that function, cannot or had better not change it.

Obviously, I could write a loop. Michael in a previous post suggested using Vectorize, which works perfectly. As a newbie to R, I would like to learn more ways to achieve my goal (sorry, it automatically involves how, not just what ;). Is there a way to do it using an *apply function, where * stands for any variant?

Thanks a lot!

From: John Fox j...@mcmaster.ca
To: 'Alex Zhang' alex.zh...@ymail.com
Cc: r-help@r-project.org
Sent: Tuesday, December 27, 2011 4:06 PM
Subject: RE: [R] sapply Call Returning "the condition has length > 1" Error

Dear Alex,

-----Original Message-----
From: Alex Zhang [mailto:alex.zh...@ymail.com]
Sent: December-27-11 3:34 PM
To: John Fox
Cc: r-help@r-project.org
Subject: Re: [R] sapply Call Returning "the condition has length > 1" Error

> John,
> Thanks for the pointers. The DummyFunc is just a made-up example. The true function I need to use is more complicated and would be distracting to include.

You'll probably get a better answer if you don't keep what you want to do a secret.

> Do you mean that sapply would take columns in the input data.frame and feed them into FUN as whole vectors? That explains the behavior.

Yes. As I said, a data frame is a list of columns, so FUN is called with each column as its argument.

> Is there an *apply function that will feed elements of the input data.frame into FUN instead of whole columns? Thanks.

I'm afraid that I don't know what you mean. Do you want to deal with the columns of the data frame separately (in general, they need not all be of the same class), and within each column, apply a function separately to each element? You could nest calls to lapply() or sapply(), as in

    sapply(D, function(DD) sapply(DD, abs))

assuming, of course, that D is an entirely numeric data frame. But in this case, abs(as.matrix(D)) would be more sensible, and using sapply() like this isn't necessarily better than a loop.

Again, not knowing what you want to do makes it hard to suggest a solution.

Best,
John

From: John Fox j...@mcmaster.ca
To: 'Alex Zhang' alex.zh...@ymail.com
Cc: r-help@r-project.org
Sent: Tuesday, December 27, 2011 3:10 PM
Subject: RE: [R] sapply Call Returning "the condition has length > 1" Error

Dear Alex,

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Alex Zhang
Sent: December-27-11 2:14 PM
To: r-help@r-project.org
Subject: [R] sapply Call Returning "the condition has length > 1" Error

> Dear all,
> Happy new year! I have a question re using sapply. Below is a dummy example that would replicate the error I saw.
>
> ##Code Starts here
> DummyFunc <- function(x) {
>   if (x > 0) {
>     return (x)
>   } else {
>     return (-x)
>   }
> }
> Y = data.frame(val = c(-3:7))
> sapply(Y, FUN = DummyFunc)
> ##Code ends here
>
> When I run it, I got:
>
>       val
>  [1,]   3
>  [2,]   2
>  [3,]   1
>  [4,]   0
>  [5,]  -1
>  [6,]  -2
>  [7,]  -3
>  [8,]  -4
>  [9,]  -5
> [10,]  -6
> [11,]  -7
> Warning message:
> In if (x > 0) { :
>   the condition has length > 1 and only the first element will be used
>
> The result is different from what I would expect plus there is such an error message.

This is a warning, not really an error message. A data frame is essentially a list of variables (columns), and sapply() applies its FUN argument to each list element, that is, each variable -- the one variable val in your case. That produces a warning because val > 0 is a vector of 11 elements, and only the first comparison, -3 > 0, which is FALSE, controls the result, so -val is returned for the whole column.

> I guess if the DummyFunc I provided is compatible with vectors, the problem would go away. But let's suppose I cannot change DummyFunc. Is there still a way to use sapply or alike without actually writing a loop? Thanks.

Well, you could just use

    abs(Y$val)
    [1] 3 2 1 0 1 2 3 4 5 6 7

but I suppose that you didn't really want to write your own version of the absolute-value function as something more than an exercise. An alternative is

    with(Y, ifelse(val > 0, val, -val))
    [1] 3 2 1 0 1 2 3 4 5 6 7
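A minimal sketch of John's point that a data frame is a list of columns. It assumes the comparison stripped from the archived DummyFunc was `x > 0`; the name `col.lengths` is only illustrative. The original call `sapply(Y, FUN = DummyFunc)` is deliberately not run here because, as far as I know, R 4.2.0 and later turn that warning into an error.

```r
# DummyFunc as posted, assuming the stripped operator was `>`.
DummyFunc <- function(x) {
  if (x > 0) {
    return(x)
  } else {
    return(-x)
  }
}

Y <- data.frame(val = -3:7)

# A data frame is a list of columns, so sapply(Y, ...) sees one element: `val`.
is.list(Y)
col.lengths <- sapply(Y, length)   # FUN receives the whole 11-element column
print(col.lengths)

# Iterating over the column itself hands DummyFunc one number at a time.
res <- sapply(Y$val, DummyFunc)
print(res)
```

Passing `Y$val` rather than `Y` is the smallest change that gives element-wise calls without touching DummyFunc.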
Re: [R] sapply Call Returning "the condition has length > 1" Error
I suggest you (re-?)read the posting guide. Proper etiquette on this list is to provide a fully self-contained (reproducible) example that demonstrates your problem. You are free to supply an alternate to your actual function as long as it illustrates your problem and you can deal with the re-substitution on your own.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.
Re: [R] sapply Call Returning "the condition has length > 1" Error
Jeff,

Could you please tell me which part of the guide I didn't follow? I provided a piece of code that runs in R and reproduces the problem, and I also provided my results. Other people wanted to help me learn more, so they asked me for more details. I replied very carefully, explaining that I had provided all the relevant information.

Jeff, before you responded, did you read any of the previous posts?

Sent from my iPhone
Re: [R] sapply Call Returning "the condition has length > 1" Error
Your puzzle comes from a collision of two somewhat subtle facts: i) sapply() is a wrapper for lapply(), not apply(), and ii) data.frame()s are secretly column-wise lists. Because of this, sapply() (= lapply()) takes each list element (= data.frame column) and passes it to the function as a whole.

Compare this behavior to

    lapply(1:4, function(x) max(x^2))

which converts its non-list input (= c(1,2,3,4)) into a list (= list(1,2,3,4)) before processing. This is different from

    lapply(list(1:4), function(x) max(x^2))

If you want to work element-wise, you can use apply() rather than sapply(), or Vectorize(), or just a plain-ol' for loop.

Does this help?

Michael

PS -- You should nag your collaborator about making a non-vectorized function. If you got the warning message that started this all off, there's likely a bug in his code of the if+else vs ifelse variety. If you've never seen a document called The R Inferno before, Google it and take a look through: it's full of all sorts of helpful intermediate-level tips, and these sorts of subtleties are well documented.
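The element-wise routes Michael mentions can be sketched side by side. This is only an illustration under assumptions: DummyFunc is taken to use `x > 0`, the column is made numeric so that `vapply()` can declare a `numeric(1)` result, and the names `v1` to `v4` are made up here.

```r
# Scalar-only function, assuming the stripped comparison was `>`.
DummyFunc <- function(x) if (x > 0) x else -x

Y <- data.frame(val = as.numeric(-3:7))

# 1. Vectorize() wraps the scalar function so it accepts a vector.
v1 <- Vectorize(DummyFunc)(Y$val)

# 2. vapply() over the column (not the data frame) calls FUN once per element.
v2 <- vapply(Y$val, DummyFunc, numeric(1))

# 3. apply() with MARGIN = c(1, 2) visits every cell of the (numeric) data frame.
v3 <- apply(Y, c(1, 2), DummyFunc)[, "val"]

# 4. A plain for loop over the column.
v4 <- numeric(nrow(Y))
for (i in seq_along(Y$val)) v4[i] <- DummyFunc(Y$val[i])

print(v1)
```

All four leave DummyFunc untouched; which one reads best is largely a matter of taste, and none is necessarily faster than the loop.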
Re: [R] sapply Call Returning "the condition has length > 1" Error
Michael, John and Bert,

Thank you all very much for your help. I think I have gained a lot of understanding and am ready to write better code. I will study The R Inferno. I appreciate it.

- Alex
Re: [R] sapply(pred,cor,y=resp)
Hi,

It is probably more confusing with several steps combined, but you are correct that it is because there are NAs. It is fairly common for R functions to return NA if there are any NA values unless you explicitly set an argument on what to do with missing values. A quick look at ?cor clearly shows that there is such an argument with several options. Try adding ', use = "pairwise.complete.obs"' to your sapply call.

Hope that helps,

Josh

On Sun, Oct 9, 2011 at 12:47 AM, William Claster dmfall2...@yahoo.com wrote:

> Hello. I am wondering why I am getting NA for all in cors=sapply(pred,cor,y=resp). I suppose that each column in pred has NAs in them. Is there some way to fix this? Thanks
>
> str(pred)
> 'data.frame': 200 obs. of 13 variables:
>  $ mnO2: num 9.8 8 11.4 4.8 9 13.1 10.3 10.6 3.4 9.9 ...
>  $ Cl  : num 60.8 57.8 40 77.4 55.4 ...
>  $ NO3 : num 6.24 1.29 5.33 2.3 10.42 ...
>  $ NH4 : num 578 370 346.7 98.2 233.7 ...
>  $ oPO4: num 105 428.8 125.7 61.2 58.2 ...
>  $ PO4 : num 170 558.8 187.1 138.7 97.6 ...
>  $ Chla: num 50 1.3 15.6 1.4 10.5 ...
>  $ a1  : num 0 1.4 3.3 3.1 9.2 15.1 2.4 18.2 25.4 17 ...
>  $ a2  : num 0 7.6 53.6 41 2.9 14.6 1.2 1.6 5.4 0 ...
>  $ a3  : num 0 4.8 1.9 18.9 7.5 1.4 3.2 0 2.5 0 ...
>  $ a4  : num 0 1.9 0 0 0 0 3.9 0 0 2.9 ...
>  $ a5  : num 34.2 6.7 0 1.4 7.5 22.5 5.8 5.5 0 0 ...
>  $ a6  : num 8.3 0 0 0 4.1 12.6 6.8 8.7 0 0 ...
>
> str(y=resp)
> Error in length(object) : 'object' is missing
>
> str(resp)
>  num [1:200] 8 8.35 8.1 8.07 8.06 8.25 8.15 8.05 8.7 7.93 ...
>
> cors=sapply(pred,cor,y=resp)
> cors
> mnO2   Cl  NO3  NH4 oPO4  PO4 Chla   a1   a2   a3   a4   a5   a6
>   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA
>
> [[alternative HTML version deleted]]

In the future, please post in plain text, not HTML (as the posting guide requests).

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
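A minimal sketch of the effect Josh describes, using a made-up two-column `pred` and a made-up `resp` (the poster's data are not available, so these values are purely illustrative). Extra arguments given to `sapply()` are passed on to `cor()`, which is how `use` reaches it.

```r
# Hypothetical stand-ins for the poster's pred and resp; each column has one NA.
pred <- data.frame(a = c(1, 2, NA, 4, 5),
                   b = c(2, 1, 4, NA, 6))
resp <- c(1.1, 2.0, 2.9, 4.2, 5.1)

# Default use = "everything": any NA in a column makes its correlation NA.
cors.default <- sapply(pred, cor, y = resp)
print(cors.default)

# Drop incomplete pairs column by column before correlating.
cors <- sapply(pred, cor, y = resp, use = "pairwise.complete.obs")
print(cors)
```

With a single `x` and `y` vector per call, "pairwise.complete.obs" and "complete.obs" should give the same answer; the difference only matters when `cor()` is handed a whole matrix.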
Re: [R] sapply to bind columns, with repeat?
Hi Weidong Gu,

This works! For my clarity, and so I can repeat this process if need be:

The 'mat' generates a matrix from whatever is supplied to x (i.e. coop.dat), using positions 9:length(x), with 6 columns (filled by row).

The 'rem.col' generates a matrix of 8 columns from the first 8 positions (1:8).

The 'return' statement calls the function to cbind together rem.col and mat.

Then 'apply' this all to coop.dat, by rows, using function reorg.

Is this correct?

Thank you very much,
Katrina

On Fri, Aug 12, 2011 at 10:28 AM, Weidong Gu anopheles...@gmail.com wrote:

> Katrina,
> try this.
>
> reorg <- function(x) {
>   mat <- matrix(x[9:length(x)], ncol=6, byrow=T)
>   rem.col <- matrix(rep(x[1:8], nrow(mat)), byrow=T, ncol=8)
>   return(data.frame(cbind(rem.col, mat)))
> }
> co <- do.call('rbind', apply(coop.dat, 1, function(x) reorg(x)))
>
> You may need to tweak a bit to fit exactly what you want.
>
> Weidong Gu

On Fri, Aug 12, 2011 at 2:35 AM, Katrina Bennett kebenn...@alaska.edu wrote:

> Hi R-help,
>
> I am working with US COOP network station data and the files are concatenated in single rows for all years, but I need to pull these apart into rows for each day. To do this, I need to extract part of each row such as station id, year, mo, and repeat this against other variables in the row (days). My problem is that there are repeated values for each day, and the files are fixed width field without order.
>
> Here is an example of just one line of data.
>
> coop.raw <- c("DLY09752806TMAX F2010010620107 00049 20107 00062 B0207 00041 20207 00049 B0307 00040 20307 00041 B0407 00042 20407 00040 B0507 00041 20507 00042 B0607 00043 20607 00041 B0707 00055 20707 00043 B0807 00039 20807 00055 B0907 00037 20907 00039 B1007 00038 21007 00037 B1107 00048 21107 00038 B1207 00050 21207 00048 B1307 00051 21307 00050 B1407 00058 21407 00051 B1507 00068 21507 00058 B1607 00065 21607 00068 B1707 00068 21707 00065 B1807 00067 21807 00068 B1907 00068 21907 00067 B2007 00069 22007 00068 B2107 00057 22107 00069 B2207 00048 22207 00057 B2307 00051 22307 00048 B2407 00073 22407 00051 B2507 00062 22507 00073 B2607 00056 22607 00062 B2707 00053 22707 00056 B2807 00064 22807 00053 B2907 00057 22907 00064 B3007 00047 23007 00057 B3107 00046 23107 00047 B")
> write.csv(coop.raw, "coop.tmp", row.names=F, quote=F)
> coop.dat <- read.fwf("coop.tmp", widths = c(c(3,8,4,2,4,2,4,3), rep(c(2,2,1,5,1,1),62)), na.strings=c(), skip=1, as.is=T)
> rep.name <- rep(c("day","hr","met","dat","fl1","fl2"), 62)
> rep.count <- rep(c(1:62), each=6, 1)
> names(coop.dat) <- c("rect", "id", "elem", "unt", "year", "mo", "fill", "numval", paste(rep.name, rep.count, sep="_"))
>
> I would like to generate output that contains, in one row, the columns id, elem, unt, year, mo, and numval. Bound to these initial columns, I would like only day_1, hr_1, met_1, dat_1, fl1_1, and fl2_1.
>
> Then, in the next row, I would like the initial columns id, elem, unt, year, mo, and numval repeated and then bound to day_2, hr_2, met_2, dat_2, fl1_2, and fl2_2, and so on until all the data for the row have been allocated.
>
> Then, move on to the next row and repeat.
>
> I think I should be able to do this with some sort of sapply or lapply function, but I'm struggling with the format for repeating the initial columns, and then skipping through the next columns.
>
> Thank you,
> Katrina
Re: [R] sapply to bind columns, with repeat?
Katrina,

try this.

    reorg <- function(x) {
      mat <- matrix(x[9:length(x)], ncol=6, byrow=T)
      rem.col <- matrix(rep(x[1:8], nrow(mat)), byrow=T, ncol=8)
      return(data.frame(cbind(rem.col, mat)))
    }
    co <- do.call('rbind', apply(coop.dat, 1, function(x) reorg(x)))

You may need to tweak a bit to fit exactly what you want.

Weidong Gu
Re: [R] sapply to bind columns, with repeat?
On Fri, Aug 12, 2011 at 5:08 PM, Katrina Bennett kebenn...@alaska.edu wrote:

> Hi Weidong Gu,
>
> This works! For my clarity, and so I can repeat this process if need be:
>
> The 'mat' generates a matrix using whatever is supplied to x (i.e. coop.dat) using the columns from position 9:length(x) of 6 columns (by row).

x is passed one row of coop.dat at a time. matrix() then splits that part of the row (9:length(x)) into a matrix of 6 columns; note the byrow parameter.

> The 'rem.col' generates a matrix of the first 1:8 columns of 8 columns.

This repeats the first 8 columns; note that nrow(mat) is used so the number of repeated rows matches that of mat.

> The 'return' statement calls the function to cbind together rem.col and mat.

It returns a data.frame, so apply() yields a list of data frames.

> Then 'apply' this all to coop.dat, by rows, using function reorg.

do.call('rbind', <returned list of data frames>) then combines the list into one data frame. You can see what is going on with:

    temp <- apply(coop.dat, 1, function(x) reorg(x))
    str(temp)
    do.call('rbind', temp)

HTH
Weidong Gu

> Is this correct?
>
> Thank you very much,
> Katrina
>
> On Fri, Aug 12, 2011 at 10:28 AM, Weidong Gu anopheles...@gmail.com wrote:
>
>> Katrina, try this.
>>
>>     reorg <- function(x){
>>       mat <- matrix(x[9:length(x)], ncol=6, byrow=T)
>>       rem.col <- matrix(rep(x[1:8], nrow(mat)), byrow=T, ncol=8)
>>       return(data.frame(cbind(rem.col, mat)))
>>     }
>>     co <- do.call('rbind', apply(coop.dat, 1, function(x) reorg(x)))
>>
>> You may need to tweak a bit to fit exactly what you want.
>>
>> Weidong Gu
>>
>> On Fri, Aug 12, 2011 at 2:35 AM, Katrina Bennett kebenn...@alaska.edu wrote:
>>
>>> Hi R-help,
>>>
>>> I am working with US COOP network station data and the files are concatenated in single rows for all years, but I need to pull these apart into rows for each day. To do this, I need to extract part of each row, such as station id, year, mo, and repeat this against the other variables in the row (days). My problem is that there are repeated values for each day, and the files are fixed-width fields without order.
>>>
>>> Here is an example of just one line of data.
>>>
>>>     coop.raw <- c("DLY09752806TMAX F2010010620107 00049 20107 00062 B0207 00041 20207 00049 B0307 00040 20307 00041 B0407 00042 20407 00040 B0507 00041 20507 00042 B0607 00043 20607 00041 B0707 00055 20707 00043 B0807 00039 20807 00055 B0907 00037 20907 00039 B1007 00038 21007 00037 B1107 00048 21107 00038 B1207 00050 21207 00048 B1307 00051 21307 00050 B1407 00058 21407 00051 B1507 00068 21507 00058 B1607 00065 21607 00068 B1707 00068 21707 00065 B1807 00067 21807 00068 B1907 00068 21907 00067 B2007 00069 22007 00068 B2107 00057 22107 00069 B2207 00048 22207 00057 B2307 00051 22307 00048 B2407 00073 22407 00051 B2507 00062 22507 00073 B2607 00056 22607 00062 B2707 00053 22707 00056 B2807 00064 22807 00053 B2907 00057 22907 00064 B3007 00047 23007 00057 B3107 00046 23107 00047 B")
>>>     write.csv(coop.raw, "coop.tmp", row.names=F, quote=F)
>>>     coop.dat <- read.fwf("coop.tmp", widths = c(c(3,8,4,2,4,2,4,3), rep(c(2,2,1,5,1,1), 62)), na.strings=c(), skip=1, as.is=T)
>>>     rep.name <- rep(c("day","hr","met","dat","fl1","fl2"), 62)
>>>     rep.count <- rep(c(1:62), each=6, 1)
>>>     names(coop.dat) <- c("rect", "id", "elem", "unt", "year", "mo", "fill", "numval", paste(rep.name, rep.count, sep="_"))
>>>
>>> I would like to generate output that contains, in one row, the columns id, elem, unt, year, mo, and numval. Bound to these initial columns, I would like only day_1, hr_1, met_1, dat_1, fl1_1, and fl2_1. Then, in the next row, I would like the initial columns id, elem, unt, year, mo, and numval repeated and then bound to day_2, hr_2, met_2, dat_2, fl1_2, and fl2_2, and so on until all the data for the row has been allocated. Then, move on to the next row and repeat.
>>>
>>> I think I should be able to do this with some sort of sapply or lapply function, but I'm struggling with the format for repeating the initial columns and then skipping through the next columns.
>>>
>>> Thank you,
>>> Katrina
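For readers who want to try the reorg() idea without the COOP file, here is a minimal sketch on a made-up data frame. The object toy, its column names, and the constants n_lead and n_group are assumptions for illustration only (2 leading columns and groups of 3 fields instead of 8 and 6). Note that apply() converts the data frame to a character matrix first, so the numeric columns have to be converted back at the end.

```r
# Toy stand-in for coop.dat: 2 leading columns followed by 2 groups of 3 fields.
# All names and values are invented for illustration.
toy <- data.frame(id = c("A", "B"), year = c(2010, 2011),
                  day_1 = c(1, 1), val_1 = c(49, 51), fl_1 = c("x", "y"),
                  day_2 = c(2, 2), val_2 = c(62, 50), fl_2 = c("x", "z"),
                  stringsAsFactors = FALSE)

n_lead  <- 2   # leading columns repeated on every output row
n_group <- 3   # fields per repeated group

reorg_toy <- function(x) {
  # x is a character vector: one row of the coerced matrix
  mat  <- matrix(x[(n_lead + 1):length(x)], ncol = n_group, byrow = TRUE)
  lead <- matrix(rep(x[1:n_lead], nrow(mat)), ncol = n_lead, byrow = TRUE)
  data.frame(cbind(lead, mat), stringsAsFactors = FALSE)
}

# apply() returns a list of data frames; rbind them into one
long <- do.call("rbind", apply(toy, 1, reorg_toy))
names(long) <- c("id", "year", "day", "val", "fl")
long[] <- lapply(long, type.convert, as.is = TRUE)  # restore numeric columns
print(long)
```

Each input row becomes one output row per group, with the leading columns repeated.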
Re: [R] sapply( ) a loop function
I am not sure what purpose the while loop has. However, the main problem seems to be that you need to write

    i <- sample(1:(n-40), 1)   # this samples from 1 to n-40

rather than

    i <- sample(1:n-40, 1)     # this takes 1:n and then subtracts 40

Otherwise, you may get negative index values.

Best,
Daniel

bignami83 wrote:

> Hello R-universe...
>
> I am having trouble writing a function which contains a loop so I can sapply() it to a list of data frames. Each data frame has 241 observations of 15 variables. My loop takes a random sample of one row until the 40 consecutive rows after the sample have a d2p (variable) sum greater than 5.
>
> Here is my loop code (it works fine when applied to a 241-observation data frame):
>
>     n <- 241
>     test <- 0
>     while (test < 5) {
>       i <- sample(1:n-40, 1)
>       x <- my.data.frame[seq(from=i, to=i+40), ]
>       test <- sum(x[,"d2p"], na.rm=TRUE)
>     }; i; test
>
> I need this loop to be applied to EACH DATA FRAME in a list of 360 data frames created by splitting my master data frame, d, by each fish number (it contains observations for 360 fish, each with 241 observations of 15 variables).
>
>     split.df <- split(d, d$fish)
>
> I'm kind of new at writing functions, so I'm sure this probably doesn't make much sense, but here is what I tried for the function:
>
>     samp.func <- function(f) {
>       n <- 241
>       test <- 0
>       while (test < 5) {
>         i <- sample(1:n-40, 1)
>         x <- f[seq(from=i, to=i+40), ]
>         test <- sum(x[,"d2p"], na.rm=TRUE)
>       }}
>
> and then tried to apply it to my list of data frames (NOT WORKING):
>
>     sapply(split.df, samp.func)
>
> I'm pretty sure I'm missing some way to instruct that I want the loop to cycle through for each item (data frame) in my split.df list. I've tried to play around with including for loops, to no avail... any ideas?
>
> Here is code for a simplified mock single-fish data frame (i.e. one item in my split.df list) to plug into the loop if you want to try it.
>
>     fish <- rep(1, 241)
>     d2p <- seq((50/241), 50, (50/241))
>     my.data.frame <- data.frame(fish, d2p)
>
> thanks!!
> Sean
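The precedence point above is easy to check at the console. This small sketch only uses base R; n is the value from the thread.

```r
n <- 241

# The colon operator binds tighter than subtraction,
# so 1:n-40 is (1:n) - 40 and includes zero and negative values.
wrong <- 1:n - 40
right <- 1:(n - 40)

range(wrong)
range(right)
```

Sampling from the first vector can therefore produce indices that are zero or negative, which is not what the loop intends.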
Re: [R] sapply( ) a loop function
Hi:

samp.func() doesn't return anything. Either (1) type test as the last line of the function body or (2) don't assign the last sum to an object.

HTH,
Dennis
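To make the "return something" advice concrete, here is a hedged sketch of the function with an explicit last expression, applied to a small mock list. The mock data (3 fish instead of 360) and the argument names width and threshold are assumptions for illustration, not part of the original post.

```r
set.seed(1)

# Mock data: 3 fish with 241 rows each (values are illustrative only)
d <- data.frame(fish = rep(1:3, each = 241),
                d2p  = rep((1:241) * 50 / 241, times = 3))
split.df <- split(d, d$fish)

samp.func <- function(f, n = nrow(f), width = 40, threshold = 5) {
  test <- 0
  while (test < threshold) {
    i <- sample(1:(n - width), 1)                 # parentheses matter here
    test <- sum(f[i:(i + width), "d2p"], na.rm = TRUE)
  }
  c(i = i, test = test)   # the last expression is what the function returns
}

res <- sapply(split.df, samp.func)
print(res)
```

Because each call returns a length-2 named vector, sapply() simplifies the result to a matrix with one column per fish.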
Re: [R] sapply( ) a loop function
The previous two posters basically covered everything, but since I'm on the train with not too much to do, here's a more detailed response building on what they said. The following code is shovel-ready and can be pasted directly to your command line if you have your main data frame, called d, available.

----- BEGIN R CODE -----

    # This will install the (awesome) plyr package if you don't already have it on your machine.
    # Feel free to comment it out if you know you have plyr already.
    # install.packages("plyr")
    require(plyr)
    # This loads the plyr package, which adds a bunch of variants on the apply family.
    # We will use a variant below that's more appropriate for what you are trying to do.
    # sapply() wouldn't be a problem, but this lets us have that little bit of extra control
    # over the call that makes R great.

    # Create some sample data
    # fish = rep(1:3, each = 241)
    # d2p = seq((50/241), 50, (50/241))
    # d2p = c(d2p, d2p, d2p)
    # d = data.frame(fish, d2p)

    # This creates a list of data frames, as you rightly identified.
    split.df <- split(d, d$fish)

    # Set up the function -- I gave it a whimsical name because I have no idea what this
    # is intended to do or why you want to do it. Of course, you should feel free to adapt it to
    # something more professional. I don't know if you've seen the general function syntax before,
    # but this sets up a function with optional additional inputs; defaults are set at the
    # top of the function definition. They match the parameters of your problem, so you
    # won't need to change them later.
    FunFishFunction <- function(fish, n = 241, n2 = 40, chemLevel = 5, chemName = "d2p") {
        test <- 0; iters = 0
        while (test < chemLevel) {
            i <- sample(1:(n-n2), 1)
            test = sum(fish[i:(i+n2), chemName], na.rm=T)
            # Some catch code to make sure you don't wind up in an infinite loop.
            # If you get to a large number of iterations, this will check to make sure
            # it is possible to find the desired place and return it if possible; it marks
            # both of these escape cases so they can be identified later if desired.
            if (iters > 50) {
                s = cumsum(na.omit(fish[, chemName]))
                s = diff(s, lag = n2)
                if (max(s) < chemLevel) {
                    return(c(Inf, max(s))) # Return an Infinity to indicate it can't be done.
                } else {
                    return(c(which.max(s > chemLevel), chemLevel)) # Return the first passing spot and the critical value
                }
            }
            iters = iters + 1
        }
        return(c(i, test))
        # As noted, it's good form to explicitly return any results in a user-defined function.
        # By doing so here, we could halt function execution as noted in our escape code above.
        # We simply bind them together with c() here so both can be directly returned.
    }

    # This is where the plyr package comes in. It takes in an l -- list, plys it and returns
    # a d -- data frame. The general plyr package has variants like llply or aaply or
    # daply or whatever you may need. (a = array)
    # It also provides a nifty progress bar for longer calculations so you don't drive yourself crazy waiting :-)
    # If you want to override the defaults we set above, simply put n = 250 or whatever between the
    # function name and the call to the progress bar.
    # The general syntax is tTply(IN, FUNC, FUNC_OPTIONS, PLY_OPTIONS)
    #   t is a code indicating the type of thing going in
    #   T indicates the type of thing coming out
    #   IN is the thing going in
    #   FUNC is the function to be plyd
    #   FUNC_OPTIONS are additional things to be passed to FUNC. There are none here.
    #   PLY_OPTIONS are additional options for the **ply function -- here we invoke a progress bar.
    FishConclusions = ldply(split.df, FunFishFunction, .progress = 'text')
    colnames(FishConclusions) <- c("Fish Number", "i", "test") # Add informative column names to our data frame
    print(FishConclusions)

----- END R CODE -----

Hopefully this helps! Good luck and feel free to get in touch if I can clarify any of this or be of further help.

Michael Weylandt
Re: [R] SAPPLY function XXXX
Dan,

> I am attempting to write a function to count the number of non-missing values of each column in a data frame using the sapply function. I have the following code, which is receiving the error message below.
>
>     n.valid <- sapply(data1, sum(!is.na))
>     Error in !is.na : invalid argument type

That's the FUN argument to sapply, which expects a function. is.na is indeed a function, but !is.na is not a function:

    !is.na
    Error in !is.na : invalid argument type

You need to write your own function to do what you want. Luckily this is easy. Let's write one to count the number of non-missing values in a vector.

    countNonNAs <- function(x) { sum(!is.na(x)) }

Now you have a function that does what you want, so you can use sapply with it.

    sapply(data1, countNonNAs)

You could also use an anonymous (unnamed) function within sapply to the same effect.

    sapply(data1, function(x) sum(!is.na(x)))

NB: none of this is tested!

--Erik

> Ultimately, I would like for this to be 1 component in a larger function that will produce PROC CONTENTS style output. Something like...
>
>     data1.contents <- data.frame(Variable=names(data1), Class=sapply(data1, class), n.valid=sapply(data1, sum(!is.na)), n.miss=sapply(data1, sum(is.na)))
>     data1.contents
>
> Any suggestions/assistance are appreciated.
>
> Thank you,
> Daniel
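Putting the pieces of this thread together, a PROC CONTENTS style summary could look roughly like the sketch below. The data frame data1 here is a made-up example, and taking only the first element of class() is my own assumption so that multi-class columns still give one value per variable.

```r
# Made-up example data with some missing values
data1 <- data.frame(a = c(1, NA, 3),
                    b = c("x", "y", NA),
                    stringsAsFactors = FALSE)

data1.contents <- data.frame(
  Variable  = names(data1),
  Class     = sapply(data1, function(x) class(x)[1]),  # first class only
  n.valid   = sapply(data1, function(x) sum(!is.na(x))),
  n.miss    = sapply(data1, function(x) sum(is.na(x))),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(data1.contents)
```

Each sapply() call receives a function of one column, which is the fix for the "invalid argument type" error.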
Re: [R] SAPPLY function XXXX
> Ultimately, I would like for this to be 1 component in a larger function that will produce PROC CONTENTS style output. Something like...
>
>     data1.contents <- data.frame(Variable=names(data1), Class=sapply(data1, class), n.valid=sapply(data1, sum(!is.na)), n.miss=sapply(data1, sum(is.na)))
>     data1.contents

Also meant to mention: see ?describe in the Hmisc package. E.g.,

    describe(c(NA, 1:10))

There is also a useful method for data.frame objects.

--Erik
Re: [R] SAPPLY function XXXX
Perfect Erik! Thank you!
Re: [R] SAPPLY function XXXX
Hi

r-help-boun...@r-project.org wrote on 04.05.2011 22:26:59:

> Erik Iverson er...@ccbr.umn.edu
> Sent by: r-help-boun...@r-project.org
> 04.05.2011 22:26
> To: Dan Abner dan.abne...@gmail.com
>
>> Ultimately, I would like for this to be 1 component in a larger function that will produce PROC CONTENTS style output. Something like...
>>
>>     data1.contents <- data.frame(Variable=names(data1), Class=sapply(data1, class), n.valid=sapply(data1, sum(!is.na)), n.miss=sapply(data1, sum(is.na)))
>>     data1.contents
>
> Also meant to mention: see ?describe in the Hmisc package. E.g., describe(c(NA, 1:10)). There is also a useful method for data.frame objects.
>
> --Erik

    colSums(is.na(data1))
    colSums(!is.na(data1))

will also show the number of missing and non-missing values in each column of a data frame.

Regards
Petr
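As a quick check that the colSums() shortcut agrees with the sapply() version, here is a small sketch on a made-up data frame (data1 below is illustrative only).

```r
data1 <- data.frame(a = c(1, NA, 3),
                    b = c(NA, NA, "z"),
                    stringsAsFactors = FALSE)

# is.na() on a data frame gives a logical matrix; colSums() counts TRUEs per column
n.miss.cs  <- colSums(is.na(data1))
n.valid.cs <- colSums(!is.na(data1))

# Same counts, one column at a time
n.miss.sa  <- sapply(data1, function(x) sum(is.na(x)))
n.valid.sa <- sapply(data1, function(x) sum(!is.na(x)))

print(rbind(n.miss.cs, n.valid.cs))
```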
Re: [R] Sapply for descriptive statistics
On Mar 9, 2011, at 5:59 PM, Tomii dioge...@gmail.com wrote:

> I am trying to calculate descriptive statistics for one of the variables in a data frame; however, sapply calculates these statistics for every value of the variable separately. How can I make it calculate the range (as well as other statistics) for the whole column?

If you want the range of all columns, try:

    sapply(as1, range)

-- David

> Here are commands and results:
>
>     as1$trust
>      [1]  5.957510  5.888664  6.168135  6.419472  5.668796  6.026923  6.456721  7.017946  5.294411
>     [10]  7.296844  6.479167  5.009000  7.149073  5.932667  5.991000  5.327137  5.453230  5.650350
>     [19]  5.295608  5.518337  4.875000  6.637000  5.891014  6.726055 10.695650  5.490983  7.290476
>     [28]  5.728543  4.103689  8.421315
>
>     des.trust <- sapply(as1$trust, range, na.rm=TRUE)
>     des.trust
>              [,1]     [,2]     [,3]     [,4]     [,5]     [,6]     [,7]     [,8]     [,9]    [,10]
>     [1,]  5.95751 5.888664 6.168135 6.419472 5.668796 6.026923 6.456721 7.017946 5.294411 7.296844
>     [2,]  5.95751 5.888664 6.168135 6.419472 5.668796 6.026923 6.456721 7.017946 5.294411 7.296844
>             [,11]    [,12]    [,13]    [,14]    [,15]    [,16]    [,17]    [,18]    [,19]    [,20]    [,21]
>     [1,] 6.479167    5.009 7.149073 5.932667    5.991 5.327137  5.45323  5.65035 5.295608 5.518337    4.875
>     [2,] 6.479167    5.009 7.149073 5.932667    5.991 5.327137  5.45323  5.65035 5.295608 5.518337    4.875
>             [,22]    [,23]    [,24]    [,25]    [,26]    [,27]    [,28]    [,29]    [,30]
>     [1,]    6.637 5.891014 6.726055 10.69565 5.490983 7.290476 5.728543 4.103689 8.421315
>     [2,]    6.637 5.891014 6.726055 10.69565 5.490983 7.290476 5.728543 4.103689 8.421315
>
> tomii t.klep...@gmail.com
Re: [R] Sapply for descriptive statistics
Hi:

Perhaps something like this?

    m <- matrix(rnorm(100, mean = 10, sd = 2), ncol = 5)
    colnames(m) <- paste('V', 1:5, sep = '')

    # Summary function:
    summs <- function(x) c(mean = mean(x), sd = sd(x), range = diff(range(x)))

    # Apply to columns of m and transpose the result:
    t(apply(m, 2, summs))

For the sample I generated, the result is

    t(apply(m, 2, summs))
            mean       sd    range
    V1 10.304586 2.330545 9.016235
    V2  9.135212 1.993268 7.528364
    V3  9.873094 2.155940 7.554311
    V4 10.308073 2.357703 9.483222
    V5 10.532111 2.378734 8.073172

HTH,
Dennis
Re: [R] Sapply for descriptive statistics
On Mar 9, 2011, at 6:13 PM, David Winsemius dwinsem...@comcast.net wrote:

> If you want the range of all columns, try:
>
>     sapply(as1, range)

To expand: you sent separate values to range, because sapply() iterates over the elements of the vector as1$trust. If you had tried:

    sapply(as1["trust"], range)

... you should have succeeded.

-- David
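The difference between handing sapply() a vector and handing it a one-column data frame can be seen on a tiny made-up example (the values and the extra age column below are invented; only the column name trust comes from the thread).

```r
as1 <- data.frame(trust = c(5.96, 5.89, 6.17),
                  age   = c(30, 41, 25))   # made-up values

per_value  <- sapply(as1$trust, range)     # range() of each single number: 2 x 3 matrix
per_column <- sapply(as1["trust"], range)  # one range for the whole column: 2 x 1 matrix
whole      <- range(as1$trust)             # simplest form for a single variable

print(per_value)
print(per_column)
print(whole)
```

For a single variable, calling range() directly is usually enough; sapply() becomes useful when the same statistic is wanted for several columns at once.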
Re: [R] sapply puzzlement
sapply(z, function(row) ...) does not actually grab a row at a time out of 'z'. It grabs a column (because 'z' is a data.frame).

You may want:

    t(apply(z, 1, function(row) row - means))

or:

    t(t(z) - means)

Hope that helps,
-David Johnston
Re: [R] sapply puzzlement
On Jan 27, 2011, at 7:16 PM, Ernest Adrogué i Calveras wrote:

> Hi, I have this data.frame with two variables in it,
>
>     z
>       V1 V2
>     1 10  8
>     2 NA 18
>     3  9  7
>     4  3 NA
>     5 NA 10
>     6 11 12
>     7 13  9
>     8 12 11
>
> and a vector of means,
>
>     means <- apply(z, 2, function (col) mean(na.omit(col)))
>     means
>            V1        V2
>      9.666667 10.714286

Two methods:

A) use sweep (which by default takes the difference)

    sweep(z, 2, means)
              V1         V2
    1  0.3333333 -2.7142857
    2         NA  7.2857143
    3 -0.6666667 -3.7142857
    4 -6.6666667         NA
    5         NA -0.7142857
    6  1.3333333  1.2857143
    7  3.3333333 -1.7142857
    8  2.3333333  0.2857143

B) use the scale function (whose whole purpose in life is to subtract the mean and possibly divide by the standard deviation, which we suppressed in this case with the scale=FALSE argument)

    scale(z, scale=FALSE)
              V1         V2
    1  0.3333333 -2.7142857
    2         NA  7.2857143
    3 -0.6666667 -3.7142857
    4 -6.6666667         NA
    5         NA -0.7142857
    6  1.3333333  1.2857143
    7  3.3333333 -1.7142857
    8  2.3333333  0.2857143
    attr(,"scaled:center")
           V1        V2
     9.666667 10.714286

-- David.

> My intention was subtracting means from z, so instinctively I tried
>
>     z-means
>               V1         V2
>     1  0.3333333 -1.6666667
>     2         NA  7.2857143
>     3 -0.6666667 -2.6666667
>     4 -7.7142857         NA
>     5         NA  0.3333333
>     6  0.2857143  1.2857143
>     7  3.3333333 -0.6666667
>     8  1.2857143  0.2857143
>
> But this is completely wrong. sapply() gives the same result:
>
>     sapply(z, function(row) row - means)
>                  V1         V2
>     [1,]  0.3333333 -1.6666667
>     [2,]         NA  7.2857143
>     [3,] -0.6666667 -2.6666667
>     [4,] -7.7142857         NA
>     [5,]         NA  0.3333333
>     [6,]  0.2857143  1.2857143
>     [7,]  3.3333333 -0.6666667
>     [8,]  1.2857143  0.2857143
>
> So, what is going on here? The following appears to work
>
>     z-matrix(means,ncol=2)[rep(1, dim(z)[1]),]
>               V1         V2
>     1  0.3333333 -2.7142857
>     2         NA  7.2857143
>     3 -0.6666667 -3.7142857
>     4 -6.6666667         NA
>     5         NA -0.7142857
>     6  1.3333333  1.2857143
>     7  3.3333333 -1.7142857
>     8  2.3333333  0.2857143
>
> but I think it's rather cumbersome; surely there must be a cleaner way to do it.
>
> -- Ernest

David Winsemius, MD
West Hartford, CT
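The data in this thread are small enough to rebuild, so the suggested approaches can be compared directly. The sketch below recreates z from the posted values and uses colMeans(na.rm = TRUE) for the means, which is my substitution for the apply()/na.omit() call in the original post and gives the same numbers.

```r
z <- data.frame(V1 = c(10, NA, 9, 3, NA, 11, 13, 12),
                V2 = c(8, 18, 7, NA, 10, 12, 9, 11))
means <- colMeans(z, na.rm = TRUE)

centered_sweep <- sweep(z, 2, means)        # subtract means[j] from column j
centered_scale <- scale(z, scale = FALSE)   # centers columns, NAs ignored for the mean
centered_t     <- t(t(z) - means)           # transpose so recycling runs along rows

print(centered_sweep)
```

All three give the same centered values; sweep() keeps the data frame, while scale() and the double transpose return matrices.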
Re: [R] sapply puzzlement
In addition to what has already been suggested you could use ..

    mapply(function(x,y) x-y, z, means)

which returns

                 V1         V2
    [1,]  0.3333333 -2.7142857
    [2,]         NA  7.2857143
    [3,] -0.6666667 -3.7142857
    [4,] -6.6666667         NA
    [5,]         NA -0.7142857
    [6,]  1.3333333  1.2857143
    [7,]  3.3333333 -1.7142857
    [8,]  2.3333333  0.2857143

The results you see when you use the z-means approach are caused by the vectors being different lengths. The shorter one (means) is repeated. Phil Spector's book describes a nice example which illustrates the behaviour.

    nums = 1:10
    nums + c(1,2)

HTH
Pete
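The recycling example mentioned above is short enough to run as is. The sketch adds one extra line (the explicit rep() call is my addition) to show what R effectively does with the shorter vector.

```r
nums <- 1:10

# c(1, 2) is recycled to the length of nums: 1 2 1 2 1 2 ...
recycled <- nums + c(1, 2)

# Equivalent to spelling the repetition out by hand
explicit <- nums + rep(c(1, 2), length.out = length(nums))

print(recycled)
```

With z - means the same thing happens down the columns of z, so the V1 and V2 means alternate from row to row instead of lining up with their columns.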
Re: [R] sapply puzzlement
R works by going down the columns: the shorter vector is recycled in column order. If you make the rows into columns, it then does what you want. You just have to make the columns back into rows to get the original shape of your matrix. So the code in one line is:

    t(t(z) - means)

--
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010 Australia
Re: [R] SApply versus for loop for list of data.frames
On Oct 12, 2010, at 12:16 AM, rivercode wrote:

> Hi, I am trying to find the total number of rows for a list of data.frames and want to know if there is a better way than using a loop like:
>
>     df = { list of data.frame with varying number of rows...each one has a column called COL }
>     r = 0
>     for (i in 1:length(df)) {
>       r = r + length(df[[i]]$COL)
>     }
>     r
>     6000123 number of rows.

    r <- lapply(df, NROW)
    r

-- David.
Re: [R] SApply versus for loop for list of data.frames
On Oct 12, 2010, at 12:33 AM, David Winsemius wrote:

> On Oct 12, 2010, at 12:16 AM, rivercode wrote:
>
>> Hi, I am trying to find the total number of rows for a list of data.frames and want to know if there is a better way than using a loop [...]
>
>     r <- lapply(df, NROW)
>     r

Rather:

    sum(unlist(r))
    # or
    sum(sapply(df, NROW))

-- David.
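A small self-contained check of the one-liner against the original loop, using a made-up list of three data frames. The vapply() variant is my addition; it does the same job as sapply() but fails loudly if an element does not yield a single integer.

```r
# Made-up list of data frames with varying numbers of rows
df <- list(data.frame(COL = 1:3),
           data.frame(COL = 1:5),
           data.frame(COL = 1:2))

# Loop version from the question
r_loop <- 0
for (i in seq_along(df)) {
  r_loop <- r_loop + length(df[[i]]$COL)
}

# Vectorised versions
r_sapply <- sum(sapply(df, NROW))
r_vapply <- sum(vapply(df, NROW, integer(1)))

print(c(loop = r_loop, sapply = r_sapply, vapply = r_vapply))
```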
Re: [R] sapply/lapply instead of loop
Will this do what you want?

    > newTemp[] <- lapply(newTemp, function(.col){
    +     # convert to character and pad to 5 spaces
    +     sprintf("%5s", as.character(.col))
    + })
    > str(newTemp)
    'data.frame': 5 obs. of 3 variables:
     $ DX1: chr "13761" "63371" "51745" "64081" ...
     $ DX2: chr " 8125" "  v75" "77703" "32826" ...
     $ DX3: chr "49178" "22237" "93500" "  v72" ...

On Tue, Aug 10, 2010 at 2:55 PM, GL pfl...@shands.ufl.edu wrote:

> Using the input below, can I do something more elegant (and more efficient) than the loop also listed below to pad strings to a width of 5? The true matrix is about 300K rows and 31 columns.
>
> ###
> #INPUT
> ###
>
>     temp
>         DX1   DX2   DX3
>     1 13761  8125 49178
>     2 63371   v75 22237
>     3 51745 77703 93500
>     4 64081 32826   v72
>     5 78477 43828 87645
>
> ###
> #CODE
> ###
>
>     ssize <- c(nrow(temp), ncol(temp))
>     aa <- c(1:ssize[2])
>     aa <- paste("DX", aa, sep = "")
>     record <- matrix("?", nrow = ssize[1], ncol = ssize[2])
>     colnames(record) <- aa
>     mm <- 0
>     #for (j in 1:1) {
>     for (j in 1:ssize[1]) {
>       mm <- j
>       a <- as.character(as.matrix(as.data.frame(temp[j,])))
>       len2 <- sum(a != "?")
>       mi <- 0
>       for (k in 1:len2) {
>         aa <- a[k]
>         a0 <- 5 - nchar(aa)
>         if (a0 > 0) {
>           for (st in 1:a0) {
>             aa <- paste(aa, " ", sep = "")
>           }
>         }
>         record[j, k] <- aa
>       }
>     }
>
> ###
> #OUTPUT
> ###
>
>       DX1   DX2   DX3
>     1 13761 8125  49178
>     2 63371 v75   22237
>     3 51745 77703 93500
>     4 64081 32826 v72
>     5 78477 43828 87645

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] sapply/lapply instead of loop
Try this:

  formatC(as.matrix(temp))

On Tue, Aug 10, 2010 at 3:55 PM, GL pfl...@shands.ufl.edu wrote:
> Using the input below, can I do something more elegant (and more efficient) than the loop also listed below to pad strings to a width of 5? The true matrix is about 300K rows and 31 columns.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
Re: [R] sapply/lapply instead of loop
Both of those approaches seem to return right-justified strings ("  v75") instead of left-justified ones ("v75  ").

View this message in context: http://r.789695.n4.nabble.com/sapply-lapply-instead-of-loop-tp2320265p2320305.html
Re: [R] sapply/lapply instead of loop
So try:

  format(as.matrix(temp))

On Tue, Aug 10, 2010 at 4:13 PM, GL pfl...@shands.ufl.edu wrote:
> Both of those approaches seem to return ( v75) instead of (v75 ).

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
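The difference between the three suggestions in this thread comes down to justification. Here is a minimal sketch, assuming a small all-character data frame that mirrors GL's example; the object names `right`, `left`, `padded` and `temp_padded` are only illustrative and do not come from the thread.

```r
# Toy data mirroring the example in the thread; all columns are character.
temp <- data.frame(
  DX1 = c("13761", "63371", "51745"),
  DX2 = c("8125", "v75", "77703"),
  DX3 = c("49178", "22237", "v72"),
  stringsAsFactors = FALSE
)

# "%5s" right-justifies (pads on the left) ...
right <- sprintf("%5s", temp$DX2)
# ... while "%-5s" left-justifies (pads on the right), as the original loop did.
left <- sprintf("%-5s", temp$DX2)

# format() on a character matrix pads on the right by default.
padded <- format(as.matrix(temp), width = 5)

# The same left-justified padding applied column by column,
# keeping the result as a data frame.
temp_padded <- temp
temp_padded[] <- lapply(temp, function(col) formatC(col, width = 5, flag = "-"))
```

If the data frame is needed afterwards, the `lapply()` form is probably the more convenient one; `format(as.matrix(...))` returns a character matrix instead.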
Re: [R] sapply/lapply instead of loop
That works great, and is ever so much simpler. Thanks much!

View this message in context: http://r.789695.n4.nabble.com/sapply-lapply-instead-of-loop-tp2320265p2320317.html
Re: [R] sapply or apply
Hello Roslina,

Maybe it is just me, but I have difficulty picking apart what you are trying to do, because the data have the same names as the arguments in your functions, and when you create the function term(), you have two sets of arguments (for term() and for gam_sum()) that have the same names, in addition to data with the same names. You also rely on the arguments' positions rather than explicitly stating their names.

My guess is that you are having problems because sapply() is passing data from 'bt' to the first argument of gam_sum() (which is named 'alp'), so you *might* not be using the data you think you are in the arguments you think you are.

At any rate, I can tell you what is happening right now with term(). It is calling gam_sum() for each individual element of 'bt' like so:

  gam_sum(bt[1], alp, bt_min, 1)
  gam_sum(bt[2], alp, bt_min, 1)
  gam_sum(bt[3], alp, bt_min, 1)
  gam_sum(bt[4], alp, bt_min, 1)

and taking the sum of all of these. Note that since the data 'alp' has 4 elements, gam_sum() also returns 4 elements for each element of the data 'bt'.

Best regards,

Josh

On Wed, Jun 16, 2010 at 5:21 PM, Roslina Zakaria zrosl...@yahoo.com wrote:
> Hi r-users,
>
> I have this code here:
>
> dt <- winter_pos_sum
> bt <- c(24.96874, 19.67861, 23.51001, 19.86868); round(bt,2)
> alp <- c(2.724234, 3.914649, 3.229146, 3.120719); round(alp,2)
> bt_min <- min(bt) ; bt_min
> p <- alp_sum ; p
> t <- 50
> t1 <- t+1
>
> #first get the sum over the eigenvalues for a particular power i
> gam_sum <- function(alp,bt,bt_min,i) {alp*(1-bt_min/bt)^i}
> gam_sum(alp,bt,bt_min,1)
>
> 0.57718379 + 0. + 0.52625031 + 0.02985377 = 1.133288
>
> term <- function(i,alp,bt,bt_min) {sum(sapply(bt,gam_sum , alp,bt_min,i))}
> term (1,alp,bt,bt_min)
> [1] -1817.765
>
> I should get the value using term (1,alp,bt,bt_min)=1.133288. I have 4 values of beta and 4 values of alpha.
>
> Thank you so much for your help.

--
Joshua Wiley
Ph.D. Student
Health Psychology
University of California, Los Angeles
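Josh's diagnosis can be checked directly with the numbers from Roslina's post. The sketch below rests on one observation: gam_sum() is already vectorised, so no sapply() is needed. If an explicit element-by-element pairing is preferred, mapply() with named arguments avoids the positional mix-up. The names term_orig, term_vec and term_mapply are introduced here for illustration and are not from the thread.

```r
bt  <- c(24.96874, 19.67861, 23.51001, 19.86868)
alp <- c(2.724234, 3.914649, 3.229146, 3.120719)
bt_min <- min(bt)

gam_sum <- function(alp, bt, bt_min, i) alp * (1 - bt_min / bt)^i

# What the original term() did: each bt[k] lands in the 'alp' slot,
# and the whole 'alp' vector lands in the 'bt' slot.
term_orig <- function(i, alp, bt, bt_min) {
  sum(sapply(bt, gam_sum, alp, bt_min, i))
}

# Vectorised version: pairs alp[k] with bt[k] and sums the four terms.
term_vec <- function(i, alp, bt, bt_min) {
  sum(gam_sum(alp = alp, bt = bt, bt_min = bt_min, i = i))
}

# Equivalent explicit pairing with mapply() and named arguments.
term_mapply <- function(i, alp, bt, bt_min) {
  sum(mapply(gam_sum, alp = alp, bt = bt,
             MoreArgs = list(bt_min = bt_min, i = i)))
}

term_orig(1, alp, bt, bt_min)  # the unexpected value, about -1817.765
term_vec(1, alp, bt, bt_min)   # about 1.133288, the value Roslina expected
```

Naming the arguments in the call is what makes the intent visible; the positional call is the only real difference between the failing and the working version.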
Re: [R] sapply code
Try this:

  1 + (1 / log(length(lamda_cor))) * sum((l <- lamda_cor / length(lamda_cor)) * log(l))

On Sun, May 16, 2010 at 10:43 PM, Roslina Zakaria zrosl...@yahoo.com wrote:
> Hi r-users,
>
> I have this code here, but I just wonder how do I use 'sapply' to make it more efficient:
>
> lamda_cor <- eigen(winter_cor)$values
> lamda_cor
> [1] 1.3459066 1.0368399 0.8958128 0.7214407
>
> lamda_cxn <- function(dt) {
>   n <- length(dt)
>   term <- vector(length=n, mode="numeric")
>   for (i in 1:n) {
>     term[i] <- (dt[i]/n)*log(dt[i]/n)
>   }
>   #sum(term)
>   cxn <- 1 + (1/log(n))*sum(term)
>   cxn
> }
>
> lamda_cxn(lamda_cor)
> [1] 0.01861457
>
> Thank you so much for all helps given.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
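Since arithmetic and log() are vectorised in R, the loop in lamda_cxn() can be dropped without using sapply() at all. The sketch below uses the eigenvalues quoted in the question; lamda_cxn_vec and lamda_cxn_sapply are names introduced here for illustration only.

```r
lamda_cor <- c(1.3459066, 1.0368399, 0.8958128, 0.7214407)

# Loop-based version, as in the question.
lamda_cxn <- function(dt) {
  n <- length(dt)
  term <- numeric(n)
  for (i in seq_len(n)) term[i] <- (dt[i] / n) * log(dt[i] / n)
  1 + (1 / log(n)) * sum(term)
}

# Vectorised version: the whole vector is scaled and logged at once.
lamda_cxn_vec <- function(dt) {
  p <- dt / length(dt)
  1 + sum(p * log(p)) / log(length(dt))
}

# Direct sapply() translation of the loop, shown only for comparison.
lamda_cxn_sapply <- function(dt) {
  n <- length(dt)
  1 + sum(sapply(dt, function(d) (d / n) * log(d / n))) / log(n)
}

lamda_cxn_vec(lamda_cor)  # about 0.0186, matching the value in the question
```

For a vector of four eigenvalues the speed difference is negligible; the vectorised form is mainly shorter and easier to read.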
Re: [R] sapply, lattice functions
Dear R-gurus,

How do I implement the following:

a) Overlay frequency (instead of density) with a density line, vertical lines for confidence intervals, and reference levels?
b) Control the breaks (using nint?), the order of the panels, and the layout, and place units for each conditioning variable?

Is there a more efficient way than the lines provided below? Please feel free to suggest other ways of displaying the above information.

  library(lattice)
  library(reshape)

  # The data frame below is an example of a subset of results from bootstrap runs.
  # From the results of the bootstrap, I would like to display a frequency histogram
  # with an overlay of density, confidence intervals, and point estimates.
  aa <- data.frame(id=seq(20),a1=rnorm(20),b1=rnorm(20,0.8),c1=rnorm(20,0.5))
  aa1 <- rbind(aa,c( -99, -0.02,1.09, 0.23)) # record of reference values for each distribution, set id = -99
  ab <- melt(aa1,measure=names(aa1)[-match("id",names(aa1))])
  uns <- unlist(list(id="",a1="mL/h",b1="L",c1="L"))
  uns1 <- data.frame(variable=names(uns),uni=(uns))
  ab1 <- merge(ab,uns1,by="variable",all.x=T)
  ab2 <- ab1[order(ab1$variable,ab1$id),]
  histogram(~value|paste(as.character(variable)," (",uni,")",sep=""),
            breaks=NULL,nint=10,ab2,layout=c(2,2),as.table=T,type="density",
            scales=list(relation='free'),
            panel=function(x,lqp=c(0.025,0.975),...) { # lqp indicate confidence intervals
              x1 <- x[-1]
              vs <- x[1]
              panel.histogram(x1,col='light blue',...)
              panel.densityplot(x1,col.line='blue',lwd=1.75,...)
              panel.abline(v=c(quantile(as.vector(x1),prob=lqp,na.rm = T),vs),
                           col=c("blue","blue","dark green"),lwd=2,lty=c(2,2,1))},
            strip=strip.custom(strip.names=F, strip.levels=T))

Thanks so much for your help!

Santosh

On Sat, Mar 20, 2010 at 10:56 AM, Sundar Dorai-Raj sdorai...@gmail.com wrote:
> You're right. It's necessary for xyplot though to prevent grouping.
>
> On Mar 20, 2010 10:43 AM, Dieter Menne dieter.me...@menne-biomed.de wrote:
>> Sundar Dorai-Raj-2 wrote:
>>> Or perhaps more clearly, histogram(~a1 + b1 + c1, data = aa, o...
>>
>> Why outer=TRUE? Looks same for me without:
>>
>> Dieter
>>
>> library(lattice)
>> aa <- data.frame(a1=rnorm(20),b1=rnorm(20,0.8),c1=rnorm(20,0.5))
>> histogram(~a1 + b1 + c1, data = aa)
>>
>> View this message in context: http://n4.nabble.com/sapply-lattice-functions-tp1618134p1676043.html
Re: [R] sapply, lattice functions
Sundar Dorai-Raj-2 wrote:
> Or perhaps more clearly,
>
> histogram(~a1 + b1 + c1, data = aa, outer = TRUE)

Why outer=TRUE? Looks the same for me without:

Dieter

  library(lattice)
  aa <- data.frame(a1=rnorm(20),b1=rnorm(20,0.8),c1=rnorm(20,0.5))
  histogram(~a1 + b1 + c1, data = aa)

View this message in context: http://n4.nabble.com/sapply-lattice-functions-tp1618134p1676043.html
Re: [R] sapply, lattice functions
You're right. It's necessary for xyplot though to prevent grouping.

On Mar 20, 2010 10:43 AM, Dieter Menne dieter.me...@menne-biomed.de wrote:
> Sundar Dorai-Raj-2 wrote:
>> Or perhaps more clearly, histogram(~a1 + b1 + c1, data = aa, o...
>
> Why outer=TRUE? Looks same for me without:
Re: [R] sapply, lattice functions
Try this:

  junk <- sapply(aa, function(x) print(histogram(x, breaks=NULL)))

or, shorter:

  for(a in aa) print(histogram(a, breaks = NULL))

On Fri, Mar 19, 2010 at 5:44 PM, Santosh santosh2...@gmail.com wrote:
> Dear R-gurus
>
> aa <- data.frame(a1=rnorm(20),b1=rnorm(20,0.8),c1=rnorm(20,0.5))
> sapply(aa,function(x) histogram(x,breaks=NULL))
>
> or
>
> px <- sapply(aa,function(x) histogram(x,breaks=NULL))
> print(px,split=c(1,1,1,1),more=F)
>
> The above code does not seem to work. Am I missing something?
>
> Thanks,
> Santosh
Re: [R] sapply, lattice functions
Thanks for your response. How do I print them in an ordered manner, akin to using print(px,split=c(2,2,1,1),more=T) or par(mfrow=c(x,y))?

-Santosh

On Fri, Mar 19, 2010 at 2:58 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> Try this:
>
> junk <- sapply(aa, function(x) print(histogram(x, breaks=NULL)))
>
> or, shorter:
>
> for(a in aa) print(histogram(a, breaks = NULL))
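One possible answer to the layout question, sketched under the assumption that the lattice package is installed. As far as I can tell, the original sapply() call fails because sapply() tries to simplify its result: every trellis object is a list of the same length, so the plots are collapsed into a matrix of components and print() no longer sees trellis objects. lapply() keeps them intact, and print() with split = c(column, row, ncol, nrow) and more = TRUE places each plot on a grid. The names plots, n_col, n_row and pos are illustrative.

```r
library(lattice)

set.seed(1)
aa <- data.frame(a1 = rnorm(20), b1 = rnorm(20, 0.8), c1 = rnorm(20, 0.5))

# lapply() returns a plain list, so each element stays a "trellis" object.
plots <- lapply(names(aa), function(nm) histogram(~ aa[[nm]], xlab = nm))

# Place the plots on a 2 x 2 grid, filling row by row.
n_col <- 2
n_row <- 2
for (i in seq_along(plots)) {
  pos <- c((i - 1) %% n_col + 1, (i - 1) %/% n_col + 1, n_col, n_row)
  # more = TRUE keeps the page open for the next plot; the last plot closes it.
  print(plots[[i]], split = pos, more = i < length(plots))
}
```

When all variables share one data frame, a single conditioned histogram is usually simpler than manual placement, but the split approach works for arbitrary, unrelated plots.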
Re: [R] sapply, lattice functions
Try this:

  histogram(~ values | ind, stack(aa))

On Fri, Mar 19, 2010 at 5:44 PM, Santosh santosh2...@gmail.com wrote:
> px <- sapply(aa,function(x) histogram(x,breaks=NULL))
> print(px,split=c(1,1,1,1),more=F)
>
> The above code does not seem to work. Am I missing something?
Re: [R] sapply, lattice functions
Or perhaps more clearly,

  histogram(~a1 + b1 + c1, data = aa, outer = TRUE)

--sundar

On Fri, Mar 19, 2010 at 3:50 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> Try this:
>
> histogram(~ values | ind, stack(aa))
Re: [R] Sapply
Try this:

  data$score <- ave(data$score, data$group, FUN = prop.table)

On Sun, Aug 30, 2009 at 6:08 PM, Noah Silverman n...@smartmediacorp.com wrote:
> Hi,
>
> I need a bit of guidance with the sapply function. I've read the help page, but am still a bit unsure how to use it.
>
> I have a large data frame with about 100 columns and 30,000 rows. One of the columns is "group", of which there are about 2,000 distinct groups.
>
> I want to normalize (sum to 1) one of my variables per-group.
>
> Normally, I would just write a huge "for each" loop, but have read that is hugely inefficient with R.
>
> The old way would be (just an example, syntax might not be perfect):
>
> for (group in data$group){
>   for (score in data[data$group == group]){
>     new_score <- score / sum(data$score[data$group==group])
>   }
> }
>
> How would I simplify this with sapply?
>
> Thanks!
>
> --
> Noah
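To see what the one-liner does, here is a small sketch on made-up data (the data frame below is hypothetical, not Noah's). ave() splits score by group, applies prop.table() within each group, which divides each value by the group total, and returns the results in the original row order.

```r
# Hypothetical toy data: two groups, five rows.
data <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  score = c(1, 3, 2, 2, 4)
)

# Normalise within each group; row order is preserved.
data$score <- ave(data$score, data$group, FUN = prop.table)
data$score

# Each group's scores now sum to 1.
group_totals <- tapply(data$score, data$group, sum)
```

Because ave() returns a vector as long as its input, the result can be assigned straight back into the data frame without any merging step.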
Re: [R] Sapply
On 30/08/2009 6:08 PM, Noah Silverman wrote:
> Hi,
>
> I need a bit of guidance with the sapply function. I've read the help page, but am still a bit unsure how to use it.
>
> I have a large data frame with about 100 columns and 30,000 rows. One of the columns is "group", of which there are about 2,000 distinct groups.
>
> I want to normalize (sum to 1) one of my variables per-group.
>
> Normally, I would just write a huge "for each" loop, but have read that is hugely inefficient with R.

Don't believe what you read, try it. If the for loop takes 100 times longer than the fastest method, but it still only takes 10 seconds, is it worth optimizing?

Duncan Murdoch
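Duncan's advice is easy to follow with system.time(). The sketch below uses simulated data of roughly the size Noah describes (30,000 rows, about 2,000 groups); the data are random and purely illustrative, and the object names are assumptions. It compares an explicit loop over groups with ave(). Timings depend on the machine, so none are quoted here.

```r
set.seed(42)
data <- data.frame(
  group = sample(2000, 30000, replace = TRUE),
  score = runif(30000)
)

# Explicit loop: one pass per distinct group.
loop_time <- system.time({
  loop_score <- numeric(nrow(data))
  for (g in unique(data$group)) {
    idx <- data$group == g
    loop_score[idx] <- data$score[idx] / sum(data$score[idx])
  }
})

# Same computation with ave().
ave_time <- system.time(
  ave_score <- ave(data$score, data$group, FUN = prop.table)
)

# Compare elapsed times, then confirm both approaches agree.
rbind(loop = loop_time, ave = ave_time)[, "elapsed"]
all.equal(loop_score, ave_score)
```

Whichever version turns out faster, checking that both give the same answer first is the cheap part of the exercise.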
Re: [R] Sapply
On Sun, Aug 30, 2009 at 5:08 PM, Noah Silverman n...@smartmediacorp.com wrote:
> I want to normalize (sum to 1) one of my variables per-group.
>
> Normally, I would just write a huge "for each" loop, but have read that is hugely inefficient with R.
>
> The old way would be (just an example, syntax might not be perfect):
>
> for (group in data$group){
>   for (score in data[data$group == group]){
>     new_score <- score / sum(data$score[data$group==group])
>   }
> }

It might be easier to use ddply from the plyr package. The command you want would be:

  data <- ddply(data, "group", transform, score = score / sum(score))

More information at http://had.co.nz/plyr.

Hadley

--
http://had.co.nz/
Re: [R] sapply() related query
On Jun 17, 2009, at 10:06 AM, Girish A.R. wrote:
> Hi folks,
>
> I'm trying to consolidate the outputs (of anova() and lrm()) from multiple runs of single-variable logistic regression. Here's how the output looks:
>
>            y ~ x1     y ~ x2    y ~ x3    y ~ x4
> Chi-Square 0.1342152  1.573538  1.267291  1.518200
> d.f.       2          2         2         1
> P          0.9350946  0.4553136 0.5306538 0.2178921
> R2         0.01003342 0.1272791 0.0954126 0.1184302
>
> The problem I have is when there are a lot more variables (15+). It would be nice if this output is transposed. Reproducible code is included below. I tried the transpose function, but it didn't seem to work.
>
> If there is a neater way of getting the desired output, I'd appreciate that as well.
>
> Lines <- "y x1 x2 x3 x4
> 0 m 1 0 7
> 1 t 2 1 13
> 0 f 1 2 18
> 1 t 1 2 16
> 1 f 3 0 16
> 0 t 3 1 16
> 0 t 1 1 16
> 0 t 2 1 16
> 1 t 3 2 14
> 0 t 1 0 9
> 0 t 1 0 10
> 1 m 1 0 4
> 0 f 2 2 18
> 1 f 1 1 12
> 0 t 2 0 13
> 0 t 1 1 16
> 1 t 1 2 7
> 0 f 2 1 18"
>
> my.data <- read.table(textConnection(Lines), header = TRUE)
> my.data$x1 <- as.factor(my.data$x1)
> my.data$x2 <- as.factor(my.data$x2)
> my.data$x3 <- as.factor(my.data$x3)
> my.data$y <- as.logical(my.data$y)
>
> sapply(paste("y ~", names(my.data)[2:dim(my.data)[2]]),
>        function(f){tab <- cbind(as.data.frame(t(anova(lrm(as.formula(f),data = my.data,x=T,y=T))[1,])),
>                                 as.data.frame(t(lrm(as.formula(f),data = my.data,x=T,y=T)$stats[10])))
>        })
>
> Thanks,
> - Girish

You can try something like this:

  library(Design)

  my.func <- function(x) {
    mod <- lrm(my.data$y ~ x)
    data.frame(t(anova(mod)[1, ]), R2 = mod$stats[10])
  }

  > t(sapply(my.data[, -1], my.func))
     Chi.Square d.f. P         R2
  x1 0.1342152  2    0.9350946 0.01003342
  x2 1.573538   2    0.4553136 0.1272791
  x3 1.267291   2    0.5306538 0.0954126
  x4 1.518200   1    0.2178921 0.1184302

I am not sure what your end game might be, but would simply express the appropriate caution if this is a step in any approach to variable selection for subsequent model development...

HTH,

Marc Schwartz
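The same "one row per predictor" layout can be sketched with base R only. This is an assumption-laden stand-in, not Marc's method: it swaps lrm() from the Design package for glm(), and it reports a likelihood-ratio chi-square rather than the statistic that anova() on an lrm fit prints (as far as I know that one is a Wald statistic), so the numbers are not expected to match the table above. The helper one_var and the column names chisq, df and p are hypothetical.

```r
lines <- "y x1 x2 x3 x4
0 m 1 0 7
1 t 2 1 13
0 f 1 2 18
1 t 1 2 16
1 f 3 0 16
0 t 3 1 16
0 t 1 1 16
0 t 2 1 16
1 t 3 2 14
0 t 1 0 9
0 t 1 0 10
1 m 1 0 4
0 f 2 2 18
1 f 1 1 12
0 t 2 0 13
0 t 1 1 16
1 t 1 2 7
0 f 2 1 18"

my.data <- read.table(text = lines, header = TRUE)
my.data[c("x1", "x2", "x3")] <- lapply(my.data[c("x1", "x2", "x3")], factor)

# Fit y ~ <one predictor> and return a one-row data frame of summary numbers.
one_var <- function(nm, dat) {
  fit <- glm(reformulate(nm, response = "y"), family = binomial, data = dat)
  chisq <- fit$null.deviance - fit$deviance   # likelihood-ratio chi-square
  df <- fit$df.null - fit$df.residual         # degrees of freedom used
  data.frame(variable = nm, chisq = chisq, df = df,
             p = pchisq(chisq, df, lower.tail = FALSE))
}

# lapply() + rbind keeps each column's type; t(sapply(...)) would give a
# matrix of list cells instead.
res <- do.call(rbind, lapply(names(my.data)[-1], one_var, dat = my.data))
res
```

Building the table with do.call(rbind, lapply(...)) gives an ordinary data frame, which tends to be easier to sort and filter than the transposed sapply() result. Marc's caution about using such a table for variable selection applies here just as much.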
Re: [R] sapply() related query
Thanks, Marc! This is what I was looking for.

best,
-Girish

PS: Also appreciate your concern about this being a part of a variable selection process.

On Jun 17, 9:01 pm, Marc Schwartz marc_schwa...@me.com wrote:
> You can try something like this:
>
> library(Design)
>
> my.func <- function(x) {
>   mod <- lrm(my.data$y ~ x)
>   data.frame(t(anova(mod)[1, ]), R2 = mod$stats[10])
> }
>
> t(sapply(my.data[, -1], my.func))
>
> I am not sure what your end game might be, but would simply express the appropriate caution if this is a step in any approach to variable selection for subsequent model development...
>
> HTH,
>
> Marc Schwartz
Re: [R] sapply
I'm not sure what you really want, so perhaps a simple example would help (i.e. what a sample of the input looks like and what the output you need looks like). My guess would be

  sapply(df, diff)

but again, I'm not sure.

--sundar

On Sun, Feb 8, 2009 at 4:24 PM, glenn g1enn.robe...@btinternet.com wrote:
> Newbie question, sorry (have tried the help pages, I promise).
>
> I have a dataframe (date, stockprice), say, and am looking how I might get the return of:
>
> dataframe (difference in days, change in stock price) using sapply
>
> I require a very simple function and don't really want to go down the zoo and quantmod route.
>
> Regards
>
> glenn
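A small sketch of what sapply(df, diff) gives for the kind of data glenn describes. The three rows below are invented for illustration, and the column names days and change are assumptions. diff() is applied to each column in turn, so the date column yields the gap in days and the price column yields the change in price.

```r
# Hypothetical stand-in for glenn's (date, stockprice) data frame.
df <- data.frame(
  date  = as.Date(c("2009-02-02", "2009-02-03", "2009-02-06")),
  price = c(100, 102, 99)
)

# diff() on every column; sapply() collects the results column by column.
changes <- sapply(df, diff)

# A more explicit version with descriptive column names.
changes_df <- data.frame(
  days   = as.numeric(diff(df$date), units = "days"),
  change = diff(df$price)
)
changes_df
```

The explicit version is slightly longer but makes the units of the date difference unambiguous, which can matter if the dates are stored as date-times rather than dates.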
Re: [R] sapply
Bullseye! Thanks a lot, Sundar.

-----Original Message-----
From: Sundar Dorai-Raj [mailto:sdorai...@gmail.com]
Sent: 09 February 2009 00:31
To: glenn
Cc: r-help@r-project.org
Subject: Re: [R] sapply

I'm not sure what you really want, so perhaps a simple example would help (i.e. what a sample of the input looks like and what the output you need looks like). My guess would be

  sapply(df, diff)

but again, I'm not sure.

--sundar
Re: [R] sapply and median, possible or not ?
Unfortunately, I have the same error message.

  lapply(rowsplit, function(x) mean(x[, sapply(x, is.numeric)]))

works, but not with median. Strange, isn't it?

Any other idea?

Thanks in advance,
Ptit Bleu.

Henrique Dallazuanna wrote:
> Try this:
>
> lapply(l, function(x) median(x[, sapply(x, is.numeric)]))
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O

View this message in context: http://www.nabble.com/sapply-and-median%2C-possible-or-not---tp20378222p20378663.html
Re: [R] sapply and median, possible or not ?
Can you provide an example of your data?

On Fri, Nov 7, 2008 at 9:14 AM, Ptit_Bleu [EMAIL PROTECTED] wrote:
> Unfortunately, I have the same error message.
>
> lapply(rowsplit, function(x) mean(x[, sapply(x, is.numeric)]))
>
> works, but not with median. Strange, isn't it?

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
Re: [R] sapply and median, possible or not ?
I haven't looked at the detail, but I guess the answer is that mean works on a data frame while median doesn't.

  ?mean
  [snip]
  For a data frame, a named vector with the appropriate method being applied column by column.

I guess to use median you'll need nested '[l/s]apply's, the outer working through the list of data frames and the inner working through the columns of each data frame.

Or perhaps, by analogy with mean.data.frame, you could just define

  median.data.frame <- function(x, ...) sapply(x, median, ...)

I haven't tried it, but it might work.

hth

Keith J

"Ptit_Bleu" [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> Unfortunately, I have the same error message.
>
> lapply(rowsplit, function(x) mean(x[, sapply(x, is.numeric)]))
>
> works, but not with median. Strange, isn't it?
>
> Any other idea?
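Keith's nested-apply suggestion can be sketched as follows. The list rowsplit below is a made-up stand-in for Ptit Bleu's data, which was not posted, and col_medians is a helper name introduced here. As far as I know, mean() on a whole data frame worked in the R versions of that era but was removed in later releases, so the column-wise form is the safer one for both functions.

```r
# Hypothetical data: one grouping column and two numeric columns.
df <- data.frame(
  id    = c("a", "a", "b", "b", "b"),
  volt  = c(1.0, 3.0, 2.0, 4.0, 9.0),
  power = c(10, 20, 5, 7, 6),
  stringsAsFactors = FALSE
)
rowsplit <- split(df, df$id)

# Inner sapply(): median of each numeric column of one data frame.
col_medians <- function(d) sapply(d[sapply(d, is.numeric)], median)

# Outer lapply(): walk through the list of data frames.
medians <- lapply(rowsplit, col_medians)
medians
```

Each list element is a named numeric vector, so do.call(rbind, medians) would turn the result into a matrix with one row per group if a table is preferred.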