I wish there were a simpler solution to this apparently simple problem! Thanks for the help.
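
For the archive, here is a minimal sketch of how I understand the list suggestion quoted below. It reuses Dennis's fout() (the na.rm handling is my own addition); lapply(), stack() and write.csv() are base R, while the seed, the object names 'outliers' and 'outlong', and the file name are only placeholders I made up. I have not tried it on the full set of >2000 variables.

```r
# Simulated data as in my original post (the seed is only for reproducibility)
set.seed(1)
datafr1 <- data.frame(var1 = rnorm(500, 10, 4),
                      var2 = rnorm(500, 20, 8),
                      var3 = rnorm(500, 30, 18),
                      var4 = rnorm(500, 40, 20))

# Dennis's function: keep values outside median +/- 2 * MAD
fout <- function(x) {
  lim <- median(x, na.rm = TRUE) + c(-2, 2) * mad(x, na.rm = TRUE)
  x[!is.na(x) & (x < lim[1] | x > lim[2])]
}

# lapply() on a data frame returns a list with one element per column,
# so elements of different lengths are not a problem
outliers <- lapply(datafr1, fout)

# stack() converts the named list into a long data frame with the
# columns 'values' and 'ind' (the variable name)
outlong <- stack(outliers)

# Placeholder output file in the session's temporary directory
outfile <- file.path(tempdir(), "out.csv")
write.csv(outlong, outfile, row.names = FALSE)
```

The long format has one row per extreme value, with the variable name in the 'ind' column, so it does not matter how many values each variable contributes.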

On Fri, Mar 18, 2011 at 10:04 AM, jim holtman <jholt...@gmail.com> wrote:

> I think it was suggested that you save your output to a 'list' and
> then you will have it in a format that can accept variable numbers of
> items in each element and it is also in a form that you can easily
> process it to create whatever other output you might need.
>
> On Fri, Mar 18, 2011 at 7:24 AM, Ram H. Sharma <sharma.ra...@gmail.com>
> wrote:
> > Hi Dennis and R-users
> >
> > Thank you for the additional help. I am pretty close, but the remaining
> > challenge is forcing output of differing lengths into an output data frame.
> >
> >> x <- data.frame(apply(datafr1, 2, fout))
> > Error in data.frame(var1 = c(-0.70777998321315, 0.418602152926712,
> > 2.08356737154810,  :
> >  arguments imply differing number of rows: 28, 12, 20, 19
> >
> > As I need to work with >2000 variables, my intention here is to save this
> > output in a form that can be manipulated further. The first goal is a
> > data frame holding the extreme values of each variable; the second is to
> > automate saving the output printed on the screen to a text file.
> >
> > Thank you for help once again.
> >
> > Ram
> >
> >
> > On Fri, Mar 18, 2011 at 3:16 AM, Dennis Murphy <djmu...@gmail.com> wrote:
> >
> >> Hi:
> >>
> >> Is this what you're after?
> >>
> >> fout <- function(x) {
> >>      lim <- median(x) + c(-2, 2) * mad(x)
> >>      x[x < lim[1] | x > lim[2]]
> >>    }
> >> > apply(datafr1, 2, fout)
> >> $var1
> >>  [1] 17.5462078 18.4548214  0.7083442  1.9207578 -1.2296787 17.4948240
> >>  [7] 19.5702558  1.6181150 20.9791652 -1.3542099  1.8215087 -1.0296303
> >> [13] 20.5237930 17.5366497 18.5657566  0.9335419 19.7519983 17.8607968
> >> [19] 19.1307524 19.6145711 21.8037136 19.1532175 -2.6688409 19.6949309
> >> [25] 1.9712347
> >>
> >> $var2
> >>  [1]  37.3822087  35.6490641  35.6000785  38.5981086  -1.6504275  37.1419290
> >>  [7]  37.7605230  40.3508689   0.6639900   2.4695841  38.8209491  39.9087921
> >> [13]  38.9907585  35.8279437   2.7870799  37.0941113   0.6308583  36.4556638
> >> [19] -10.2384849   2.8480199  -7.7680457  35.7076539  -0.5467739   3.4702765
> >> [25]  40.4818580   3.2864273   1.4917174
> >>
> >> $var3
> >>  [1]  74.252563  68.396391  68.845461  -5.006545  66.083402  76.036577
> >>  [7]  75.112586  -6.374241  63.883549  64.041216 -19.764360 -15.051017
> >> [13]  -9.782767  64.696013  70.970648  -4.562031 -22.135003  70.549310
> >> [19]  69.495915  -4.095587  86.612375  87.029526  70.072126  -6.421695
> >> [25] 65.737536
> >>
> >> $var4
> >>  [1]  81.476483  87.098767 -10.451616  91.927329  86.588952  85.080950
> >>  [7]  84.958645  -9.456368  86.270876 -22.936779  83.314032
> >>
> >> Double checks:
> >> > apply(datafr1, 2, function(x) median(x) + c(-2, 2) * mad(x))
> >>          var1      var2      var3      var4
> >> [1,]  2.12167  3.779415 -3.736066 -3.471752
> >> [2,] 17.37176 34.929800 62.969733 80.224799
> >> > apply(datafr1, 2, range)
> >>           var1      var2      var3      var4
> >> [1,] -2.668841 -10.23848 -22.13500 -22.93678
> >> [2,] 21.803714  40.48186  87.02953  91.92733
> >>
> >> Assuming you wanted to do this columnwise (by variable), it appears to be
> >> doing the right thing.
> >>
> >> HTH,
> >> Dennis
> >>
> >>
> >> On Thu, Mar 17, 2011 at 7:04 PM, Ram H. Sharma <sharma.ra...@gmail.com> wrote:
> >>
> >>> Dear R community members
> >>>
> >>> I have been struggling with this simple question but have not found an
> >>> appropriate solution, so please help.
> >>>
> >>>  # my data, though I have a large number of variables
> >>> var1 <- rnorm(500, 10,4)
> >>> var2 <- rnorm(500, 20, 8)
> >>> var3 <- rnorm(500, 30, 18)
> >>> var4 <- rnorm(500, 40, 20)
> >>> datafr1 <- data.frame(var1, var2, var3, var4)
> >>>
> >>> # my unsuccessful codes
> >>>  nvar <- ncol(datafr1)
> >>> for (i in 1:nvar) {
> >>>              out1 <- NULL
> >>>              out2 <- NULL
> >>>              medianx <- median(getdata[,i], na.rm = TRUE)
> >>>              show(madx <- mad(getdata[,i], na.rm = TRUE))
> >>>              MD1 <- c(medianx + 2*madx)
> >>>              MD2 <- c(medianx - 2*madx)
> >>>              # store data that are greater than median + 2 mad
> >>>              out1[i] <- which(getdata[,i] > MD1)
> >>>              # store data that are less than median - 2 mad
> >>>              out2[i] <- which (getdata[,1] < MD2)
> >>>             resultdf <- data.frame(out1, out2)
> >>>             write.table (resultdf, "out.csv", sep=",")
> >>>              }
> >>>
> >>>
> >>> My idea here is to store those values which are either greater than
> >>> median + 2*MAD or less than median - 2*MAD. Each variable yields output
> >>> of a different length.
> >>>
> >>> The following is the last error message:
> >>> Error in data.frame(out1, out2) :
> >>>  arguments imply differing number of rows: 2, 0
> >>> In addition: Warning messages:
> >>> 1: In out1[i] <- which(getdata[, i] > MD1) :
> >>>  number of items to replace is not a multiple of replacement length
> >>> 2: In out2[i] <- which(getdata[, 1] < MD2) :
> >>>  number of items to replace is not a multiple of replacement length
> >>> 3: In out1[i] <- which(getdata[, i] > MD1) :
> >>>  number of items to replace is not a multiple of replacement length
> >>>
> >>> Thank you in advance for helping me.
> >>>
> >>> Best regards;
> >>> RHS
> >>>
> >>>        [[alternative HTML version deleted]]
> >>>
> >>> ______________________________________________
> >>> R-help@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>
> >>
> >
> >
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
>
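
On the second goal from my earlier message (saving what is printed on the screen to a text file), a rough sketch using capture.output() from base R might look like the following. The data, the seed and the file name 'outliers.txt' are placeholders of mine, and fout() is Dennis's function from the quoted message, so please treat this as an untested idea for the real data rather than a finished solution.

```r
# Small simulated example (the seed is only for reproducibility)
set.seed(2)
datafr1 <- data.frame(var1 = rnorm(500, 10, 4),
                      var2 = rnorm(500, 20, 8))

# Dennis's function: keep values outside median +/- 2 * MAD
fout <- function(x) {
  lim <- median(x) + c(-2, 2) * mad(x)
  x[x < lim[1] | x > lim[2]]
}

# One list element per variable, each of whatever length it needs
res <- lapply(datafr1, fout)

# capture.output() writes what print() would show on the screen to a file;
# the file name is a placeholder in the session's temporary directory
txtfile <- file.path(tempdir(), "outliers.txt")
capture.output(print(res), file = txtfile)
```

The text file then holds the same $var1, $var2, ... listing that appears on the console.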

