[R] standardization

2007-07-13 Thread Amir_17
Hi
  I have a data frame that contains 5 columns and 1000 records. I want to
standardize each cell.
  I want each column to range between 0 and 1. I think I must use a loop?
  Could you help me?

   

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] standardization

2007-07-13 Thread Prof Brian Ripley
On Fri, 13 Jul 2007, David Barron wrote:

> Try having a look at the scale and sweep functions.

sweep applies to arrays, not data frames, and scale converts to a matrix.

For a data frame

df2 <- df1
df2[] <- lapply(df1, function(x) {r <- range(x, na.rm=TRUE);
   (x-r[1])/diff(r)})

seems simple enough.
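
As a quick check of the approach above, here is a minimal sketch on made-up data. The data frame df1 is simulated and the helper name range01 is hypothetical; neither comes from the original post. It only illustrates that assigning into df2[] keeps the data frame structure while each column is rescaled.

```r
# Hypothetical example data: 5 numeric columns, 1000 rows (simulated).
set.seed(1)
df1 <- as.data.frame(matrix(rnorm(5000, mean = 50, sd = 10), ncol = 5))

# Rescale one column to [0, 1], ignoring missing values.
range01 <- function(x) {
  r <- range(x, na.rm = TRUE)
  (x - r[1]) / diff(r)
}

df2 <- df1
df2[] <- lapply(df1, range01)   # df2[] keeps the data frame structure

# Each column should now have minimum 0 and maximum 1.
sapply(df2, range)
```

No explicit loop is needed; lapply() visits each column in turn. Note that a constant column would give a zero range and hence NaN, which this sketch does not guard against.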


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] standardization

2007-07-13 Thread David Barron
Try having a look at the scale and sweep functions.

David



-- 
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP



[R] Standardization Range

2007-03-28 Thread Sergio Della Franca
Dear R-Helpers,


I want to standardize a variable using the range method,

i.e.:

 Standardization (range) = (var - min(var)) / (max(var) - min(var))


Do you know how I can implement this?

Thank you in advance.


Sergio Della Franca



Re: [R] Standardization Range

2007-03-28 Thread Stéphane Dray
Hi Sergio,
Sergio Della Franca wrote:
> Dear R-Helpers,
>
> I want to standardize a variable using the range method,
>
> i.e.:
>
>  Standardization (range) = (var - min(var)) / (max(var) - min(var))
>
> Do you know how I can implement this?

Just as you wrote it ... but don't use var as a variable name, since it
is the name of the function that computes the variance.

Try something like :

  stdrange <- function(x) {(x-min(x))/(max(x)-min(x))}
  var=1:10 # not a good idea, just for fun
  stdrange(var)
 [1] 0.000 0.111 0.222 0.333 0.444 0.556 0.667
 [8] 0.778 0.889 1.000
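
The function above returns NA everywhere if x contains a missing value, and NaN if all values are equal. A slightly more defensive variant might look like the sketch below. The name stdrange2 and the choice to map a constant vector to 0 are assumptions made for illustration, not part of the original reply.

```r
# Range standardization that ignores NAs when finding min and max.
stdrange2 <- function(x) {
  lo <- min(x, na.rm = TRUE)
  hi <- max(x, na.rm = TRUE)
  if (hi == lo) {
    # Assumption: a constant vector carries no spread, so map it to 0
    # (keeping NAs as NA) instead of dividing by zero.
    return(ifelse(is.na(x), NA_real_, 0))
  }
  (x - lo) / (hi - lo)
}

stdrange2(c(2, 4, NA, 10))   # 0.00 0.25 NA 1.00
stdrange2(c(3, 3, 3))        # 0 0 0
```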



   


-- 
Stéphane DRAY ([EMAIL PROTECTED] )
Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - Lyon I
43, Bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
Tel: 33 4 72 43 27 57   Fax: 33 4 72 43 13 88
http://biomserv.univ-lyon1.fr/~dray/



Re: [R] Standardization Range

2007-03-28 Thread Dimitris Rizopoulos
You can still use scale() (as you have been told); look at its help
page for more information, especially the Arguments section, e.g.,


mat <- matrix(rnorm(100*10), 100, 10)

rng <- apply(mat, 2, range)
scale(mat, scale = rng[2, ] - rng[1, ])

or you could even use apply() directly, e.g.,

apply(mat, 2, function(x) (x - mean(x)) / diff(range(x)) )
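
Both calls above centre each column on its mean, which matches the (var - mean(var)) / range formula posted on 2007-03-27. For the (var - min(var)) / range version asked about in this thread, the centre passed to scale() would be the column minima instead. A minimal sketch on simulated data follows; the names width, centred and unit are only illustrative.

```r
set.seed(42)
mat <- matrix(rnorm(100 * 10), 100, 10)

rng   <- apply(mat, 2, range)     # row 1 = column minima, row 2 = column maxima
width <- rng[2, ] - rng[1, ]

# (x - mean(x)) / range: columns have mean 0 but are not confined to [0, 1]
centred <- scale(mat, scale = width)

# (x - min(x)) / range: every column runs from 0 to 1
unit <- scale(mat, center = rng[1, ], scale = width)

range(colMeans(centred))   # both values essentially 0
apply(unit, 2, range)      # first row all 0, second row all 1
```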


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm




[R] Standardization

2007-03-27 Thread Sergio Della Franca
Dear R-Helpers,

I want to standardize a variable using the range method.

How can I achieve this result?


Thank you in advance.


Sergio Della Franca



Re: [R] Standardization

2007-03-27 Thread Marc Schwartz
On Tue, 2007-03-27 at 16:52 +0200, Sergio Della Franca wrote:
> Dear R-Helpers,
>
> I want to standardize a variable using the range method.
>
> How can I achieve this result?
>
> Thank you in advance.
>
> Sergio Della Franca

See ?scale

HTH,

Marc Schwartz



Re: [R] Standardization

2007-03-27 Thread Bos, Roger
I am not sure I understand your question, but you may want to have a
look at ?scale.  It might get you started.

 



Re: [R] Standardization

2007-03-27 Thread Sergio Della Franca
Sorry,

Let me try to explain my problem better.

Standardization (range) = (var - mean(var)) / (max(var) - min(var))

Thank you in advance






Re: [R] Standardization

2007-03-27 Thread hadley wickham
On 3/27/07, Sergio Della Franca [EMAIL PROTECTED] wrote:
> Dear R-Helpers,
>
> I want to standardize a variable using the range method.
>
> How can I achieve this result?

One way is the rescaler method in the reshape package.  It can scale
to common range, mean 0 sd 1, or ranks.  Compared to scale, which
others have mentioned, it will work on data.frames, leaving
categorical variables unchanged.
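
For readers without the reshape package, the same idea (rescale numeric columns, leave categorical ones alone) can be sketched in base R. This is not the package's implementation; the function name rescale_df and the example data are hypothetical.

```r
# Rescale every numeric column of a data frame to [0, 1];
# factors, characters and other non-numeric columns pass through unchanged.
rescale_df <- function(df) {
  num <- vapply(df, is.numeric, logical(1))
  df[num] <- lapply(df[num], function(x) {
    r <- range(x, na.rm = TRUE)
    (x - r[1]) / diff(r)
  })
  df
}

# Hypothetical example data with one categorical column.
d <- data.frame(height = c(150, 160, 170, 180),
                weight = c(50, 65, 80, 95),
                group  = factor(c("a", "b", "a", "b")))
out <- rescale_df(d)
out
```

Here height and weight both become 0, 1/3, 2/3, 1, while group is returned as it was.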

Regards,

Hadley



Re: [R] Standardization

2007-03-27 Thread Giovanni Petris

So, scale() is the answer. Have you looked at the help? 

Giovanni  

 

-- 

Giovanni Petris  [EMAIL PROTECTED]
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/



Re: [R] standardization of values before call to pam() or clara()

2006-06-03 Thread Martin Maechler
>>>>> "Dylan" == Dylan Beaudette [EMAIL PROTECTED]
>>>>>     on Mon, 22 May 2006 17:33:47 -0700 writes:

    Dylan> Greetings, Experimenting with the cluster package,
    Dylan> and am starting to scratch my head in regards to the
    Dylan> *best* way to standardize my data. Both functions can
    Dylan> pre-standardize columns in a dataframe. according to
    Dylan> the manual:

    Dylan> Measurements are standardized for each variable
    Dylan> (column), by subtracting the variable's mean value
    Dylan> and dividing by the variable's mean absolute
    Dylan> deviation.

    Dylan> This works well when input variables are all in the
    Dylan> same units. When I include new variables with a
    Dylan> different intrinsic range, the ones with the largest
    Dylan> relative values tend to be _weighted_ . this is
    Dylan> certainly not surprising, but complicates things.

    Dylan> Does there exist a robust technique to effectively
    Dylan> re-scale each of the variables, regardless of their
    Dylan> intrinsic range to some set range, say from {0,1} ?

    Dylan> I have tried dividing a variable by the maximum value
    Dylan> of that variable, but I am not sure if this is
    Dylan> statistically correct.

A more usual scaling standardization is accomplished by the
function -- guess what? -- scale()

By default it standardizes to mean 0 and standard deviation 1.
But you can use it as well to do a [0,1] scaling.

Note that you are very wise to think about the importance of
variable scaling / weighting for cluster analysis.
But people have been here before, and invented the much more
general notion of a distance/dissimilarity between observational
units.
--> function  daisy() {in cluster} or  dist() {from stats}
provide such dissimilarity objects.
These can be used as input for  pam() or clara() as well,
and in constructing them you are much more flexible than trying
to find a proper scaling of your x-matrix.
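
A minimal sketch of the dissimilarity route described above, on made-up data. The variables elev and ph are hypothetical, and the cluster part only runs if that package is installed. With metric = "gower", daisy() range-standardizes numeric columns itself, so no manual scaling is needed there.

```r
set.seed(7)
# Hypothetical data: two variables with very different intrinsic ranges.
x <- data.frame(elev = runif(20, 0, 3000), ph = runif(20, 4, 9))

# Route 1: rescale by hand to [0, 1], then compute Euclidean distances.
x01 <- as.data.frame(lapply(x, function(v) (v - min(v)) / diff(range(v))))
d <- dist(x01)
class(d)

# Route 2: let daisy() build the dissimilarities and pass them to pam().
if (requireNamespace("cluster", quietly = TRUE)) {
  d2  <- cluster::daisy(x, metric = "gower")
  fit <- cluster::pam(d2, k = 2)   # k = 2 is an arbitrary choice here
  print(table(fit$clustering))
}
```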

Note that daisy() in particular has been designed for computing
sensible dissimilarities for the case when the X-matrix has a
collection of continuous {e.g. interval scaled} and of
categorical (e.g. binary) variables.

I recommend you get a textbook on clustering, to read up more on
the subject.

Regards, 
Martin Maechler, ETH Zurich


    Dylan> Any ideas, thoughts would be greatly appreciated.

    Dylan> Cheers,

    Dylan> -- Dylan Beaudette Soils and Biogeochemistry Graduate
    Dylan> Group University of California at Davis 530.754.7341



[R] standardization of values before call to pam() or clara()

2006-05-22 Thread Dylan Beaudette
Greetings,

I am experimenting with the cluster package, and am starting to scratch my
head in regards to the *best* way to standardize my data. Both functions can
pre-standardize columns in a data frame. According to the manual:

Measurements are standardized for each variable (column), by subtracting the 
variable's mean value and dividing by the variable's mean absolute deviation. 
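
Written out by hand, the standardization quoted from the manual might look like the sketch below. The helper name stand_mad and the example vector are made up for illustration, and this is not the package's own code. The result has mean 0 and mean absolute deviation 1, but it is not confined to any fixed interval.

```r
# Subtract the mean, then divide by the mean absolute deviation.
stand_mad <- function(x) {
  m <- mean(x, na.rm = TRUE)
  (x - m) / mean(abs(x - m), na.rm = TRUE)
}

x <- c(1, 2, 3, 4, 10)
z <- stand_mad(x)

mean(z)        # essentially 0
mean(abs(z))   # 1
range(z)       # -1.25 to 2.5, i.e. not a [0, 1] scaling
```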

This works well when input variables are all in the same units. When I include
new variables with a different intrinsic range, the ones with the largest
relative values tend to be _weighted_. This is certainly not surprising, but
it complicates things.

Does there exist a robust technique to effectively re-scale each of the 
variables, regardless of their intrinsic range to some set range, say from 
{0,1} ?

I have tried dividing a variable by the maximum value of that variable, but I 
am not sure if this is statistically correct. 

Any ideas, thoughts would be greatly appreciated.

Cheers,

-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341



[R] standardization

2005-05-18 Thread Philip Bermingham
SAS Enterprise Miner recommends standardizing using X / STDEV(X)
versus [X - mean(X)] / STDEV(X)

Any thoughts on this? Pros? Cons?
Philip
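
For reference, the two variants can be compared in R as in the sketch below (simulated data, illustrative names). Both give standard deviation 1; only the second also moves the mean to 0. One thing to watch: scale(x, center = FALSE) does not divide by sd(x) but by the root mean square, so X / STDEV(X) has to be written out explicitly.

```r
set.seed(3)
x <- rnorm(50, mean = 100, sd = 15)

z_scale_only <- x / sd(x)               # X / STDEV(X)
z_full       <- (x - mean(x)) / sd(x)   # [X - mean(X)] / STDEV(X)

c(mean = mean(z_scale_only), sd = sd(z_scale_only))   # mean stays far from 0
c(mean = mean(z_full),       sd = sd(z_full))         # mean 0, sd 1

# scale() without centering uses sqrt(sum(x^2) / (n - 1)), not sd(x).
rms <- sqrt(sum(x^2) / (length(x) - 1))
all.equal(as.vector(scale(x, center = FALSE)), x / rms)   # TRUE
```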


Re: [R] standardization

2005-05-18 Thread Peter Dalgaard
Philip Bermingham [EMAIL PROTECTED] writes:

> SAS Enterprise Miner recommends standardizing using X / STDEV(X)
> versus [X - mean(X)] / STDEV(X)
>
> Any thoughts on this? Pros? Cons?

When??? 

This makes absolutely no sense out of context.

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907



Re: [R] standardization

2005-05-18 Thread Barry Rowlingson
Peter Dalgaard wrote:
>> SAS Enterprise Miner recommends standardizing using X / STDEV(X)
>> versus [X - mean(X)] / STDEV(X)
>
> This makes absolutely no sense out of context.

To paraphrase Tanenbaum: "The nice thing about standardization is that
there are so many ways to do it."

Baz
[[
Free On-line Dictionary of Computing:
Andrew Tanenbaum, in his Computer Networks book, once said,
The nice thing about standards is that there are so many of
them to choose from,
]]


Re: [R] standardization

2005-05-18 Thread Peter Dalgaard
Barry Rowlingson [EMAIL PROTECTED] writes:


> The nice thing about standards is that there are so many of
> them to choose from,

Curiously enough, the same quote came up today on dk.edb.system.unix
in the context of translations. 

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907



Re: [R] standardization

2005-05-18 Thread TEMPL Matthias
My thoughts on this:
Do not trust what SAS says, and least of all what Enterprise Miner says.

Robust statisticians recommend standardizing using, e.g.,
(X - median(X)) / ( MAD(X) / 0.675 ) 
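
In R this robust version is short, because mad() already multiplies the raw median absolute deviation by 1.4826, which is roughly 1 / 0.675. A minimal sketch on simulated data with one gross outlier (the variable names are illustrative):

```r
set.seed(11)
x <- c(rnorm(100), 50)   # 100 ordinary values plus one outlier

# (X - median(X)) / (MAD(X) / 0.675), using R's built-in scaling constant
z_robust <- (x - median(x)) / mad(x)

# The same scale estimate from the raw (unscaled) MAD.
raw_mad <- mad(x, constant = 1)
all.equal(mad(x), raw_mad / 0.675, tolerance = 2e-3)   # TRUE

# Classical z-scores for comparison: the outlier inflates sd(x),
# so it looks less extreme than under the robust scaling.
z_classic <- (x - mean(x)) / sd(x)
c(robust = z_robust[101], classic = z_classic[101])
```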

Best,
Matthias




RE: [R] standardization

2005-05-18 Thread bogdan romocea
You asked another question about clustering, so I presume you want to
standardize some variables before clustering. In SAS, PROC STDIZE
offers 18 standardization methods. See
http://support.sas.com/91doc/getDoc/statug.hlp/stdize_sect12.htm#stat_stdize_stdizesm
for details. If you're really concerned about this I would suggest
running simulations to compare the performance of various
standardization methods (relative to your data and what you're after).

hth,
b.

