Hi, fellow R users. I have a question about combining sapply() and split().
I have a big data frame (40000 observations, 21 variables). The first variable (a factor) is "date", in the format "8.29.97"; the data are monthly. The second variable (also a factor) has levels 1 to 6 (fractiles 1 to 5, plus code 6 for a missing value). The other 19 variables are numeric. For each month I have several hundred observations of the 19 numeric variables and the 1 factor.

I am normalizing the numeric variables by dividing val1 by val2, where:

val1: (for each month, for each numeric variable) the difference between the mean of the ith numeric variable in fractile 1 and its mean in fractile 5.
val2: (for each month, for each numeric variable) the standard deviation of the ith numeric variable.

As far as I understand, I need to use the split() function several times. To calculate val1 I need to use split() twice: first to split by month and then to split by fractile. Is this even possible, given that the first application of split() returns a list? Is there a smart way to perform this normalization?

My knowledge of R is not very advanced, but I need an efficient way to perform calculations of this kind. I would really appreciate some help from experienced R users!

Regards,
S

--
Laziness is nothing more than the habit of resting before you get tired. - Jules Renard (writer)
Experience is one thing you can't get for nothing. - Oscar Wilde (writer)
When you are finished changing, you're finished. - Benjamin Franklin (Diplomat)
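
For concreteness, the calculation described above could be sketched roughly as follows. This is only an illustration on made-up toy data, not a tested solution for the real data set: the column names `date`, `fractile`, `x1` and `x2`, the helper `month_stat`, and the second date value are all assumptions. It also assumes that val2 is the standard deviation over all observations in the month, including those with fractile code 6.

```r
# Toy data standing in for the real data frame.
# All column names and values here are assumptions for illustration.
set.seed(1)
df <- data.frame(
  date     = factor(rep(c("7.31.97", "8.29.97"), each = 60)),
  fractile = factor(rep(1:6, times = 20)),
  x1       = rnorm(120),
  x2       = rnorm(120)
)
num_vars <- c("x1", "x2")  # in the real data: the 19 numeric columns

# For one month's rows, compute for each numeric variable:
# (mean in fractile 1 - mean in fractile 5) / sd over the whole month.
month_stat <- function(d) {
  sapply(d[num_vars], function(x) {
    m <- tapply(x, d$fractile, mean, na.rm = TRUE)  # mean per fractile
    unname((m["1"] - m["5"]) / sd(x, na.rm = TRUE))
  })
}

# split() by month gives a list of data frames; sapply() over that list
# returns a matrix with one column per month, so transpose it.
res <- t(sapply(split(df, df$date), month_stat))
res
```

Here split() is applied only once, by month, and tapply() takes care of the grouping by fractile inside each month, so no second split() is needed. Splitting on both factors at once with split(df, list(df$date, df$fractile)) should also be possible, but the result would then have to be regrouped by month before computing the difference.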
