[R] Selective transformation

2007-08-21 Thread Allan Kamau
I am looking for a way to transform selected observations based on a 
value-based criterion.
Why: I am learning R and would like to perform regression analysis on 
variables of the babies dataset (part of UsingR), for example babies$wt1. 
Some values in these variables should be interpreted as unknown: some 
variables use 999 for unknown and others use 99, whereas lm() expects 
unavailable data to be marked as NA.
I would like a solution that does not employ loops (I think a loop may not 
be the ideal way).

I am looking at using apply() and supplying the name of my function 
responsible for the transformation, but I do not know how to reference the 
element of the vector/list currently being processed by apply() so that I 
can substitute NA in place (if the value is 99 or 999).






   


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Selective transformation

2007-08-21 Thread Allan Kamau


- Original Message 
From: Chuck Cleland [EMAIL PROTECTED]
To: Allan Kamau [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Tuesday, August 21, 2007 6:48:53 PM
Subject: Re: [R] Selective transformation

Allan Kamau wrote:
> I am looking for a way to transform select observations based on a value 
> based criteria.
> Why - Am learning r and would like to perform regression analysis of given 
> variables of the babies dataset (part of UsingR) for example babies$wt1, the 
> data in the variables does contain values which should be interpreted as 
> unknown, some variables have 999 for unknown and some have 99 for the same, 
> since lm() expects not available data to be marked using NA.
> I would like to use a solution that does not employ loops (I think it may not 
> be the ideal way)
> 
> I am looking at using apply() and supply the name of my function responsible 
> for transformation, but am unable to know how to reference the element of the 
> vector/list being currently processed by apply() so I may do in place 
> substitution (if value is 99 or 999) of the value with NA.

  Does this do what you want?

babies$wt1 <- with(babies, replace(wt1, wt1 == 999, NA))

?replace




Thanks Chuck, the replace command is just what I was looking for.

wt1<-babies$wt1
wt1<-replace(wt1,wt1==999,NA)

I get NA in the wt1 vector in place of 999.
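
Since the question mentions that different variables use different codes for "unknown", the same idea can be applied to several columns at once without an explicit loop. The following is only a minimal sketch: the data frame dat, its columns wt1, ht and age, and all values are made up to stand in for babies, and it assumes that 99 and 999 are never legitimate measurements in these columns (that should be checked per variable).

```r
# Toy stand-in for the babies data; all values are invented for illustration.
dat <- data.frame(wt1 = c(120, 999, 135, 110, 142, 128),
                  ht  = c(62, 64, 99, 66, 61, 65),
                  age = c(27, 33, 99, 25, 30, 29))

# Sentinel codes to be treated as missing (an assumption, see above).
unknown <- c(99, 999)

# Vectorised replacement applied to every column:
# `is.na<-` marks the selected positions as NA.
dat[] <- lapply(dat, function(col) {
  is.na(col) <- col %in% unknown
  col
})

# lm() drops rows containing NA by default (na.action = na.omit).
fit <- lm(wt1 ~ ht + age, data = dat)
```

If the codes differ by variable, the same function can be called per column with its own set of codes instead of one shared vector.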




 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894






  



[R] Matrix nesting (was Re: Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.)

2007-07-30 Thread Allan Kamau
Hi









I would like to nest matrices. Is there a way of doing so? I am getting
“number of items to replace is not a multiple of replacement length”
errors (probably R is trying to flatten the matrix into a vector and
complains if the vector has more than one element during the insert).

I have a matrix (see below) in which I would like to place another
matrix into each k[2,i] position (where i is a value between 1 and 4).

Why: each value in k[1,i] may represent several (one or more) key-value
results, which I would like to capture in the corresponding k[2,i]
element.





> k
                [,1]   [,2]   [,3]   [,4]
myVariableNames "PR10" "PR11" "PR12" "PR13"
x2              "0"    "0"    "0"    "0"









Allan.




Re: [R] Matrix nesting (was Re: Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.)

2007-07-30 Thread Allan Kamau
Success, thanks Patrick. Below is the final matrix construction code.

x=list()
x[length(myVariableNames)]<-NA   # pre-size the list
names(x)<-names(x.val)
for (i in myVariableNames){
  residues=names(x.val[[i]])                 # distinct residues of variable i
  residuesFrequencies=as.vector(x.val[[i]])  # their counts
  someList=list()
  names(residuesFrequencies)=residues
  someList<-list(frequency=residuesFrequencies)
  x[i]<-someList   # stores the named frequency vector as element i
}

#The output

> x[16:18]
$PR12
 I
10

$PR13
K R
8 2

$PR14
I V
2 8





- Original Message 
From: Patrick Burns [EMAIL PROTECTED]
To: Allan Kamau [EMAIL PROTECTED]
Sent: Monday, July 30, 2007 12:01:32 PM
Subject: Re: [R] Matrix nesting (was Re: Obtaining summary of frequencies of 
value occurrences for a variable in a multivariate dataset.)

I think you want your main matrix to be of mode
list.  S Poetry talks about this some.
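
A matrix of mode list, as suggested here, can hold an arbitrary object (including another matrix) in each cell. The following is only a small sketch of that idea; the object names k, m1 and m2 and the counts are made up for illustration.

```r
# Inner matrices to be stored (made-up values for illustration).
m1 <- matrix(c("V", "10"), nrow = 2)
m2 <- matrix(c("S", "7", "T", "3"), nrow = 2)

# A 2 x 2 matrix of mode "list": each cell can hold any R object.
k <- matrix(vector("list", 4), nrow = 2,
            dimnames = list(c("name", "counts"), c("PR10", "PR11")))

# Use [[ ]] to assign a whole object into a single cell.
k[[1, 1]] <- "PR10"
k[[2, 1]] <- m1
k[[1, 2]] <- "PR11"
k[[2, 2]] <- m2

k[[2, 2]]   # retrieves the nested 2 x 2 matrix
```

With an ordinary (atomic) matrix each cell holds exactly one value, which is why inserting a longer object produces the replacement-length error.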

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)


Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

2007-07-28 Thread Allan Kamau
Hi Jim,
The problem description.
I am trying to identify mutations in a given gene from
a particular genome (biological genome sequence).
I have two CSV files consisting of sequences. One file
consists of reference (documented, curated, accepted as
standard) sequences. The other consists of sample
sequences within which I am trying to identify mutations.
In both files an individual sequence is contained in
a single record, and its amino acid residues (the actual
sequence of letters, each representing a given amino
acid, for example “A” stands for “Alanine”, “C” for
“Cysteine” and so on) are each allocated a single field
in the CSV file.
The sequences in both files have been well aligned;
each contains 115 residues, with the first residue in
field 5. Fields 1 to 4 are allocated for metadata
(name of sequence and so on).
My task is to compile an occurrence count for each
residue present in a given field of the reference
sequence dataset, and to use this information when
reading each sequence in the sample dataset to identify
mutations. For example, if a “P” is found at position 9
of the sample sequence “bb” and, according to our
reference summaries, “P” does not occur at position 9
at all or has an occurrence of only 10% or so, it will
be classified as a mutation (I could employ a cutoff
parameter for mutation classification).
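
The classification rule described above could be sketched roughly as follows. This is not code from the thread: the reference residues, the helper name is_mutation and the default cutoff of 0.10 are all assumptions made for illustration.

```r
# Made-up reference residues at one aligned position (assumption).
ref_pos9 <- c("V", "V", "V", "V", "V", "V", "V", "V", "I", "I")

# Relative frequency of each residue in the reference set.
ref_freq <- prop.table(table(ref_pos9))

# Hypothetical helper: TRUE when the residue is absent from the reference
# or occurs at or below the cutoff proportion.
is_mutation <- function(residue, freq, cutoff = 0.10) {
  p <- if (residue %in% names(freq)) freq[[residue]] else 0
  p <= cutoff
}

is_mutation("P", ref_freq)                 # residue absent from the reference
is_mutation("V", ref_freq)                 # residue common in the reference
is_mutation("I", ref_freq, cutoff = 0.25)  # stricter cutoff
```

The same check would be repeated for each of the 115 positions, using the reference frequencies of the corresponding field.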


Allan.

--- jim holtman [EMAIL PROTECTED] wrote:

> results=()#character()
> myVariableNames=names(x.val)
> results[length(myVariableNames)]<-NA
> 
> for (i in myVariableNames){
> results[i]<-names(x.val[[i]]) # this does not work it returns a
>                               # NULL (how can i convert this to
>                               # x.val$somevalue ?)
> }
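
The difference between x.val$i and x.val[[i]] in the quoted loop can be seen on a small example. A minimal sketch, assuming x.val is a named list of tables as built earlier in the thread (the data here are made up, and results is stored as a list because the number of distinct values differs between variables):

```r
# Small stand-in for x.val: a named list of frequency tables.
x.val <- list(PR13 = table(c("K", "K", "R")),
              PR14 = table(c("I", "V", "V")))

i <- "PR14"
x.val$i            # NULL: $ looks for an element literally named "i"
x.val[[i]]         # the table stored under the name held in i
names(x.val[[i]])  # the distinct values of that variable

# Collect the distinct values of every variable.
results <- vector("list", length(x.val))
names(results) <- names(x.val)
for (v in names(x.val)) {
  results[[v]] <- names(x.val[[v]])
}
```

Assigning with results[[v]] rather than results[v] stores the whole vector of names as one element, which avoids the replacement-length problem.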
 
 
 

Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

2007-07-27 Thread Allan Kamau
Hi All,
I am having difficulty finding a substitute for the command 
names(x.val$PR14) so that I can generate the command on the fly for all 
variables PR14 to PR200 (please see the previous discussion below to 
understand what the object x.val contains). I have tried the following:

results=()#character()
myVariableNames=names(x.val)
results[length(myVariableNames)]<-NA

for (i in myVariableNames){
results[i]<-names(x.val$i) # this does not work, it returns NULL (how
                           # can I convert this to x.val$somevalue?)
}

Allan.



Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

2007-07-26 Thread Allan Kamau
Thanks so much Jim, Adaikalavan, Gabor and others for the help and suggestions.
The solution will result in a matrix containing nested matrices, so that each 
variable name, each of a variable's distinct values, and the count of each 
distinct value are individually accessible.
The main matrix will contain the variable names; the first-level nested 
matrices will hold each variable's unique values; and each such value entry 
will hold a one-element vector containing its count (occurrence frequency).
This matrix can then be used to compare other similar datasets by variable 
values and their frequencies.

Building on the input received so far, a probable way of building the 
matrix includes the following steps.


1) Read the CSV file (containing column headers)
my_data=read.table("path/to/my/data.csv",header=TRUE,sep=",",dec=".",fill=TRUE)

2) Group the values in each variable, producing an occurrence count (frequency)
x.val<-apply(my_data,2,table)

3) Obtain a vector of the names of the variables in the table
names(x.val)

4) Use the names (obtained in step 3) to obtain a vector of distinct values 
in a given variable (in the example below the variable name is PR14)
names(x.val$PR14)

5) Obtain a vector (with one element) of the frequency of a value obtained 
in the step above (in our example the value is V)
as.vector(x.val$PR14["V"])

Todo:
Now I will need to place the steps above in a script (consisting of loops) to 
build the matrix; steps 4 and 5 seem tricky to do programmatically.
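
One possible way to carry out steps 4 and 5 for every variable at once is to flatten the tables into a single data frame with one row per (variable, value) pair. This is only a sketch under assumptions: my_data is replaced by a small made-up data frame, freq_long is a hypothetical name, and a data frame is swapped in for the nested matrix described above.

```r
# Made-up stand-in for my_data (read from a CSV file in the post).
my_data <- data.frame(PR13 = c("K", "K", "R", "K"),
                      PR14 = c("V", "V", "V", "I"),
                      stringsAsFactors = FALSE)

# One frequency table per column; lapply() always returns a list.
x.val <- lapply(my_data, table)

# Steps 4 and 5 for every variable: one row per (variable, value)
# pair together with its count.
freq_long <- do.call(rbind, lapply(names(x.val), function(v) {
  data.frame(variable = v,
             value    = names(x.val[[v]]),
             count    = as.vector(x.val[[v]]),
             stringsAsFactors = FALSE)
}))

# Look up a single count, e.g. how often "V" occurs in PR14.
freq_long$count[freq_long$variable == "PR14" & freq_long$value == "V"]
```

A flat table like this can be merged with the equivalent summary of another dataset when the two need to be compared.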

Allan.


- Original Message 
From: jim holtman [EMAIL PROTECTED]
To: Allan Kamau [EMAIL PROTECTED]
Cc: Adaikalavan Ramasamy [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
Sent: Wednesday, July 25, 2007 1:50:55 PM
Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a 
variable in a multivariate dataset.

Also if you want to access the individual values, you can just leave
it as a list:

> x.val <- apply(x, 2, table)
> # access each value
> x.val$PR14["V"]
V
8
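
One caveat with apply(x, 2, table): apply() returns a list only when the per-column tables differ in length, and simplifies to a matrix when they all happen to have the same length. A small sketch with made-up data showing the difference, using lapply() as an alternative that always returns a list:

```r
# Two columns that happen to have the same number of distinct values.
x <- data.frame(A = c("a", "b", "a"), B = c("c", "c", "d"),
                stringsAsFactors = FALSE)

via_apply  <- apply(x, 2, table)   # simplified: both tables have length 2
via_lapply <- lapply(x, table)     # always a list, one table per column

is.list(via_apply)    # FALSE for this data
is.list(via_lapply)   # TRUE
via_lapply$A["a"]     # count of "a" in column A
```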



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

2007-07-25 Thread Allan Kamau
A subset of the data looks as follows

> df[1:10,14:20]
   PR10 PR11 PR12 PR13 PR14 PR15 PR16
1     V    T    I    K    V    G    D
2     V    S    I    K    V    G    G
3     V    T    I    R    V    G    G
4     V    S    I    K    I    G    G
5     V    S    I    K    V    G    G
6     V    S    I    R    V    G    G
7     V    T    I    K    I    G    G
8     V    S    I    K    V    E    G
9     V    S    I    K    V    G    G
10    V    S    I    K    V    G    G

The result I would like is as follows

PR10    PR11       PR12    ...
[V:10]  [S:7,T:3]  [I:10]

The result can be in a matrix or a vector, and each variable name, value and 
frequency should be accessible so that they can be used for comparisons with 
another dataset later.
The frequency can be a count or a percentage.
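
The requested layout could be produced along these lines. This is a sketch only: the data frame d retypes the first three columns of the subset above, and fmt_counts is a hypothetical helper name.

```r
# First three columns of the posted subset, retyped for illustration.
d <- data.frame(PR10 = rep("V", 10),
                PR11 = c("T", "S", "T", "S", "S", "S", "T", "S", "S", "S"),
                PR12 = rep("I", 10),
                stringsAsFactors = FALSE)

# Hypothetical helper: collapse one column's table into "[value:count,...]".
fmt_counts <- function(col) {
  tb <- table(col)
  paste0("[", paste(names(tb), tb, sep = ":", collapse = ","), "]")
}

# Named character vector with one summary string per variable.
summary_strings <- sapply(d, fmt_counts)
summary_strings
```

For percentages instead of counts, prop.table(tb) could be used inside the helper in place of tb. The strings are convenient for display; for later comparisons the underlying tables are easier to work with.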


Allan.


- Original Message 
From: Adaikalavan Ramasamy [EMAIL PROTECTED]
To: Allan Kamau [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Tuesday, July 24, 2007 10:21:51 PM
Subject: Re: [R] Obtaining summary of frequencies of value occurrences for a 
variable in a multivariate dataset.

The name of the table should give you the value. And if you have a 
matrix, you just need to convert it into a vector first.

> m <- matrix( LETTERS[ c(1:3, 3:5, 2:4) ], nc=3 )
> m
     [,1] [,2] [,3]
[1,] "A"  "C"  "B"
[2,] "B"  "D"  "C"
[3,] "C"  "E"  "D"
> tb <- table( as.vector(m) )
> tb

A B C D E
1 2 3 2 1
> paste( names(tb), ":", tb, sep="" )
[1] "A:1" "B:2" "C:3" "D:2" "E:1"

If this is not what you want, then please give a simple example.

Regards, Adai





[R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

2007-07-24 Thread Allan Kamau
Hi all,
If the question below has been answered before, I
apologize for the posting.
I would like to get the frequencies of occurrence of
all values in a given variable in a multivariate
dataset. In short, for each variable (or field) I want a
summary of the values it contains as value:frequency
pairs; there can be many such pairs for a given
variable. I would like to do the same for several such
variables.
I have used table() but am unable to extract the
individual values and their frequencies.
Please advise.
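
For reference, the individual values and frequencies can be pulled out of a table() result as sketched below. The vector v is made up for illustration.

```r
v <- c("V", "V", "I", "V", "I", "V")   # made-up values of one variable
tb <- table(v)

values <- names(tb)        # the distinct values
counts <- as.vector(tb)    # their counts, in the same order
tb[["V"]]                  # count for one particular value

# as.data.frame() returns the value:frequency pairs as two columns.
pairs <- as.data.frame(tb, stringsAsFactors = FALSE)
```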

Allan.
