Re: [R] Issue with results from 'summary' function in R

2015-09-19 Thread Praveen Surendran
Hi Thierry,

Thank you for the response. I should have looked at the help page.

Kind Regards,
Praveen.

On 18/09/2015 15:51, "Thierry Onkelinx" <thierry.onkel...@inbo.be> wrote:

>This is described in ?summary
>
>> x <- 22072
>> getOption("digits")
>[1] 7
>> summary(x)
>   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  22070   22070   22070   22070   22070   22070
>> options(digits = 10)
>> summary(x)
>   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  22072   22072   22072   22072   22072   22072
>ir. Thierry Onkelinx
>Instituut voor natuur- en bosonderzoek / Research Institute for Nature
>and Forest
>team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
>Kliniekstraat 25
>1070 Anderlecht
>Belgium
>
>To call in the statistician after the experiment is done may be no
>more than asking him to perform a post-mortem examination: he may be
>able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
>The plural of anecdote is not data. ~ Roger Brinner
>The combination of some data and an aching desire for an answer does
>not ensure that a reasonable answer can be extracted from a given body
>of data. ~ John Tukey
>
>
>2015-09-18 14:08 GMT+02:00 Praveen Surendran <ps...@medschl.cam.ac.uk>:
>> Hi all,
>>
>> Attached table (that contains summary for a genetic association study)
>>was read using the command:
>>
>> test <- read.table('testDat.txt',header=FALSE,stringsAsFactors=FALSE)
>>
>> Results from the summary of the attached table are provided below:
>>
>>> summary(test$V5)
>>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>>   22070   22070   22070   22070   22070   22070
>>
>> As we can see column 5 of this table contains only one value - 22072
>> I am confused as to why I am getting a value 22070 in the summary of
>>this column.
>>
>> I tested this using versions of R including - R version 3.2.1
>>(2015-06-18) -- "World-Famous Astronaut"
>>
>> Thank you for looking at this issue.
>> Kind Regards,
>>
>> Praveen.
>>
>>
>>
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Issue with results from 'summary' function in R

2015-09-18 Thread Praveen Surendran
Hi all,

Attached table (that contains summary for a genetic association study) was read 
using the command:

test <- read.table('testDat.txt',header=FALSE,stringsAsFactors=FALSE)

Results from the summary of the attached table are provided below:

> summary(test$V5)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  22070   22070   22070   22070   22070   22070

As we can see column 5 of this table contains only one value - 22072
I am confused as to why I am getting a value 22070 in the summary of this 
column.
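
A note on the cause, as a minimal sketch rather than a definitive account: the 22070 is display rounding, not a change in the data. In the R version used in this thread (3.2.1), summary() for a numeric vector has a digits argument that defaults to max(3, getOption("digits") - 3), which is 4 significant digits with the default digits = 7. The sketch below uses signif() to stand in for that rounding step. Newer R releases format summaries differently, so the printed summary there may already show 22072.

```r
x <- 22072

# With the default options(digits = 7) this evaluates to 4
default_digits <- max(3, getOption("digits") - 3)

# Rounding to 4 significant digits reproduces the value seen in the summary
rounded <- signif(x, 4)
print(rounded)

# Asking for more digits in the call avoids changing the global option
print(summary(x, digits = 10))

# The underlying data are untouched either way
print(mean(x))
```

Passing digits in the call is usually less intrusive than options(digits = 10), which affects all later printing in the session.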

I tested this using versions of R including - R version 3.2.1 (2015-06-18) -- 
"World-Famous Astronaut"

Thank you for looking at this issue.
Kind Regards,

Praveen.



1 762320 C T 22072 0.0169445 0.0169445 748 1 0.00350047 21324 748 0 0.38843753.9888 0.000133264 0.994259
1 785989 T C 22072 0.7928370.79283722182 0.6337890.0166803 647 45028840-84.0255137.518 -0.00444316 0.54119
1 865545 G A 22072 0.00021447 0.00021447 6 0.6337441 13982 6 0 11.2623 4.91838 0.465569 0.0220305
1 865584 G A 22072 9.06125e-05 9.06125e-05 4 1 1 22068 4 0 0.82775 4.01634 0.0513144 0.836716
1 865628 G A 22072 0.00236503 0.00236503 58 0.451 12204 58 0 -0.662004 15.2589 -0.00284324 0.965395
1 865662 G A 22072 0.00493838 0.00493838 218 1 1 21854 218 0 -97.186229.5061 -0.11163 0.000988553

Re: [R] Matrix Multiplication using R.

2013-08-15 Thread Praveen Surendran
Dear Doran, Bert and Roger,

Thank you for attending to my query and for your valuable responses.

The task is slightly more complex. Here's the real case... I have genetic 
variation data (40,000 single nucleotide polymorphisms) from 90,000 
individuals. This makes the 90,000 (samples) rows/columns of the matrix and 
40,000 (SNPs) rows/columns of the matrix. Matrix data are genetic variations 
with values 0,1,2 or 3 where 0 is missing. There will be very few individuals 
with missing data. 

The task is to identify the relatedness between these 90,000 individuals using 
their genetic data (0,1,2 or 3). These values need to be standardised before 
matrix multiplication. This will make the matrix much larger compared to the 
0/1/2/3 matrix, and most of the values will be real numbers with decimals. 

Bert, I will not be doing a 90,000 x 40,000 %*% 40,000 x 90,000. The plan is to 
load this 90000 x 40000 matrix into R, then standardise and multiply it in 
batches of 90,000 samples against 500 samples using these 40,000 variants, and 
process these in parallel to get 90,000 x 90,000 comparisons. Does that 
clarify the situation?
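
For scale: a 90,000 x 40,000 matrix of doubles needs about 28.8 GB (3.6e9 values at 8 bytes each), and the full 90,000 x 90,000 result about 64.8 GB, so the result alone does not fit on a 64 GB node. The batching plan above could be sketched as follows. This is only an illustration on toy dimensions. The sizes, the variable names, the division by the number of SNPs, and the assumption of no missing genotypes (no 0 codes) are all mine, not part of the original plan.

```r
set.seed(1)
n_samples <- 200   # stands in for 90,000 individuals
n_snps    <- 50    # stands in for 40,000 SNPs
block     <- 40    # stands in for batches of 500 samples

# Toy genotype matrix coded 1/2/3 (assumes no missing values, i.e. no 0 codes)
G <- matrix(sample(1:3, n_samples * n_snps, replace = TRUE),
            nrow = n_samples, ncol = n_snps)

# Standardise each SNP (column) to mean 0 and standard deviation 1
Z <- scale(G)

# Fill the sample-by-sample matrix one block of columns at a time.
# Each iteration multiplies all samples against one batch of samples.
K <- matrix(NA_real_, n_samples, n_samples)
for (start in seq(1, n_samples, by = block)) {
  idx <- start:min(start + block - 1, n_samples)
  K[, idx] <- tcrossprod(Z, Z[idx, , drop = FALSE]) / n_snps
}
```

At full scale each block would more likely be written to disk than kept in one in-memory matrix, and because the result is symmetric only half of the blocks actually need computing.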

I tried loading a 90,000 x 40,000 matrix as a matrix in R this morning on the 
cluster with specifications described in my previous e-mail. This crashed due 
to memory overflow. I am trying for possibilities 

Any comments or thoughts will be greatly appreciated.

Regards,

Praveen.

-----Original Message-----
From: Roger Koenker [mailto:rkoen...@illinois.edu] 
Sent: 14 August 2013 23:06
To: Praveen Surendran
Cc: r-help@r-project.org
Subject: Re: [R] Matrix Multiplication using R.

In the event that these are moderately sparse matrices, you could try Matrix or 
SparseM.


Roger Koenker
rkoen...@illinois.edu




On Aug 14, 2013, at 10:40 AM, Praveen Surendran wrote:

> Dear all,
> 
> I am exploring ways to perform multiplication of a 90000 x 40000 matrix with 
> its transpose.
> As expected even a 40000 x 100 %*% 100 x 40000 didn't work on my desktop... 
> giving the error Error: cannot allocate vector of length 1600000000
> 
> However I am trying to run this on one node (64GB RAM; 2.60 GHz processor) of 
> a high performance computing cluster.
> I would appreciate any comments on whether it's advisable to perform a 
> matrix multiplication of this size using R, and on any better ways to 
> handle this task.
> 
> Kind Regards,
> 
> Praveen.
 
 



[R] Matrix Multiplication using R.

2013-08-14 Thread Praveen Surendran
Dear all,

I am exploring ways to perform multiplication of a 90000 x 40000 matrix with 
its transpose.
As expected even a 40000 x 100 %*% 100 x 40000 didn't work on my desktop... 
giving the error Error: cannot allocate vector of length 1600000000

However I am trying to run this on one node (64GB RAM; 2.60 GHz processor) of a 
high performance computing cluster.
Appreciate if anyone has any comments on whether it's advisable to perform a 
matrix multiplication of this size using R and also on any better ways to 
handle this task.

Kind Regards,

Praveen.




[R] Opening SAS file using read.sas7bdat() function in sas7bdat library.

2012-10-29 Thread Praveen Surendran
Hi,

 

I have a file in .sas7bdat format. I tried to open this file using the
read.sas7bdat() function.

This gave me an error:

Error in read.sas7bdat("bnp_genetic.sas7bdat") : 
  unknown host X64_7PRO.

Could someone tell me what this error means?

 

Thank you,

 

Praveen.

 




Re: [R] Opening SAS file using read.sas7bdat() function in sas7bdat library.

2012-10-29 Thread Praveen Surendran
Dear all,

Thank you for the response and.. thanks Marc.
It works with the source file which Matt has at
https://github.com/BioStatMatt/sas7bdat/blob/master/R/sas7bdat.R
which is also attached.

Cheers,

Praveen.

-----Original Message-----
From: Marc Schwartz [mailto:marc_schwa...@me.com] 
Sent: 29 October 2012 19:14
To: Duncan Murdoch
Cc: Praveen Surendran; r-help@r-project.org
Subject: Re: [R] Opening SAS file using read.sas7bdat() function in sas7bdat
library.

On Oct 29, 2012, at 2:04 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> On 29/10/2012 2:54 PM, Marc Schwartz wrote:
>> On Oct 29, 2012, at 1:28 PM, Praveen Surendran <praveen.surend...@ucd.ie> wrote:
>> 
>>> Hi,
>>> 
>>> I have a file in .sas7bdat format. I tried to open this file using
>>> read.sas7bdat() function.
>>> 
>>> This gave me an error - Error in read.sas7bdat("bnp_genetic.sas7bdat") :
>>>   unknown host X64_7PRO.
>>> 
>>> Could someone tell me what this error means?
>>> 
>>> Thank you,
>>> 
>>> Praveen.
>> 
>> More than likely, a similar problem as in this recent thread, but for 64
>> bit, rather than 32 bit:
>> 
>>   https://stat.ethz.ch/pipermail/r-help/2012-October/325257.html
>> 
>> If you look at the source for the function, it checks the SAS_host
>> against a known list, containing
>> 
>> # Host systems known to work
>> KNOWNHOST <- c("WIN_PRO", "WIN_NT", "WIN_NTSV", "WIN_SRV",
>>                "WIN_ASRV", "XP_PRO", "XP_HOME", "NET_ASRV",
>>                "NET_DSRV", "NET_SRV", "WIN_98", "W32_VSPR",
>>                "WIN", "WIN_95", "X64_VSPR", "X64_ESRV")
> 
> Praveen's host is not in that list, so the package author has never tested
> it. But nothing else in the code appears to depend on the host, so it's a
> good guess that adding another host string to that list (or changing the
> error to a warning) will make it work properly.
> 
> Duncan Murdoch


As per that prior thread, Matt has added those to the source on GitHub:

  https://github.com/BioStatMatt/sas7bdat/blob/master/R/sas7bdat.R

at line 86:

# Host systems known to work
KNOWNHOST <- c("WIN_PRO", "WIN_NT", "WIN_NTSV", "WIN_SRV",
               "WIN_ASRV", "XP_PRO", "XP_HOME", "NET_ASRV",
               "NET_DSRV", "NET_SRV", "WIN_98", "W32_VSPR",
               "WIN", "WIN_95", "X64_VSPR", "X64_ESRV",
               "W32_ESRV", "W32_7PRO", "W32_VSHO", "X64_7HOM",
               "X64_7PRO", "X64_SRV0")


It's presumably just a matter of Matt releasing an updated version of the
package. There were some comments in that prior thread of communication
issues with Matt, so not sure what is going on there relative to time frame.
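
To illustrate the kind of check being discussed, here is a hypothetical stand-in. It is not the package's actual code: the function name check_host, its strict argument, and the shortened host list are invented for this sketch. It shows how an allow-list test can either stop with an error or only warn, which is the change Duncan suggests above.

```r
# Hypothetical, shortened allow-list (the real list is the KNOWNHOST vector above)
known_hosts <- c("X64_VSPR", "X64_ESRV", "X64_7PRO")

# Hypothetical helper: returns TRUE for a known host; otherwise either stops
# (strict = TRUE) or warns and returns FALSE (strict = FALSE)
check_host <- function(host, known = known_hosts, strict = FALSE) {
  if (host %in% known) {
    return(invisible(TRUE))
  }
  msg <- paste("unknown host", host)
  if (strict) stop(msg) else warning(msg)
  invisible(FALSE)
}

ok <- check_host("X64_7PRO")
print(ok)
```

With strict = FALSE an unlisted host produces a warning and reading could continue, which matches the guess that nothing else in the code depends on the host string.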

Regards,

Marc



[R] create string with using paste.

2011-03-29 Thread Praveen Surendran
Hi,

I am trying to create a string - 'Double quotes : "' - using paste.

A command something like paste('Double','quotes : \"',sep=" ") prints 
"Double quotes : \"" where the backslash is also printed.
Is there a way to print just "?
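
A short sketch of what is going on, as I understand it: the backslash is not part of the string. print() shows the string in escaped form so that it could be pasted back into R, whereas cat() and writeLines() write the characters themselves.

```r
s <- paste("Double", "quotes : \"", sep = " ")

print(s)       # escaped display, backslash shown before the quote
cat(s, "\n")   # raw characters, no backslash
writeLines(s)  # same, with a trailing newline

# The string really contains a single quote character and no backslash
print(nchar(s))
```

print(s, quote = FALSE) and noquote(s) also show the string without the escape.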

Regards,

Praveen.



[R] Getting dates in an SPSS file in right format.

2010-05-18 Thread Praveen Surendran
Dear all,

 

I am trying to read an SPSS file into a data frame in R using
read.spss():

sample <- read.spss(file.name, to.data.frame=TRUE)

But dates in the data frame 'sample' come through as integers and not in the
date format given in the SPSS file.

I would appreciate it if anyone could help me solve this problem.
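
One common way to handle this, sketched under assumptions: SPSS stores dates as the number of seconds since 1582-10-14, so the numbers returned by read.spss() can be converted afterwards. The helper names spss_to_date and spss_to_datetime and the column name visit_date are hypothetical.

```r
# Convert SPSS date values (seconds since 1582-10-14) to R Date objects
spss_to_date <- function(x) {
  as.Date(x / 86400, origin = "1582-10-14")
}

# Variant that keeps the time of day
spss_to_datetime <- function(x) {
  as.POSIXct(x, origin = "1582-10-14", tz = "UTC")
}

# Example values: 12219379200 seconds after the SPSS origin is 1970-01-01
raw <- c(12219379200, 12219379200 + 365 * 86400)
converted <- spss_to_date(raw)
print(converted)

# Applied to a data frame read with read.spss() (hypothetical column name):
# sample$visit_date <- spss_to_date(sample$visit_date)
```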

 

Kind Regards,

 

Praveen Surendran

2G, Complex and Adaptive Systems Laboratory (UCD CASL)

School of Medicine and Medical Sciences

University College Dublin

Belfield, Dublin 4

Ireland.

 

Office : +353-(0)1716 5334

Mobile : +353-(0)8793 13071

 




[R] Reading .sav (SPSS) files.

2009-11-12 Thread Praveen Surendran
Hi,

 

I used read.spss() from library(foreign) to read a .sav file with the
following commands:

library(foreign)
data.sav <- read.spss('masterfile.sav')
mydat <- as.data.frame(data.sav)

This throws some warnings:

Warning messages:
1: In read.spss("masterfile.sav") :
  masterfile.sav: File-indicated character representation code (1252) looks
like a Windows codepage
2: In read.spss("masterfile.sav") :
  masterfile.sav: Unrecognized record type 7, subtype 20 encountered in
system file

In the resulting data frame 'mydat' all date variables are represented by
integers.

 

Q. Are there any other options which can be used with the method read.spss()
to get date variables in the right format?

 

Thanks in advance.

 

Kind Regards,

 

Praveen Surendran

2G, Complex and Adaptive Systems Laboratory (UCD CASL)

School of Medicine and Medical Sciences

University College Dublin

Belfield, Dublin 4

Ireland.

 




[R] Replacing multiple elements in a vector !

2009-10-22 Thread Praveen Surendran
Hi,

 

I have a vector with elements 

 

rs.id=c('rs100','rs101','rs102','rs103')

 

And a dataframe 'snp.id'

 

1  SNP_100  rs100

2  SNP_101  rs101

3  SNP_102  rs102

4  SNP_103  rs103

 

The task is to replace the elements of rs.id with the corresponding 'SNP_' ids in snp.id.
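
One possible approach is a lookup with match(). This is a sketch only: the column names snp and rs are assumptions, since the post does not give the data frame's column names.

```r
rs.id <- c("rs100", "rs101", "rs102", "rs103")

# Assumed layout of snp.id: one column of SNP_ ids, one of rs ids
snp.id <- data.frame(
  snp = c("SNP_100", "SNP_101", "SNP_102", "SNP_103"),
  rs  = c("rs100", "rs101", "rs102", "rs103"),
  stringsAsFactors = FALSE
)

# Position of each rs.id element in the lookup column (NA if absent)
pos <- match(rs.id, snp.id$rs)

# Use the SNP_ id where a match exists, otherwise keep the original id
new.id <- ifelse(is.na(pos), rs.id, snp.id$snp[pos])
print(new.id)
```

match() keeps the order of rs.id, so this also works when snp.id is sorted differently or contains extra rows.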

 

Thanks in advance.

 

Praveen Surendran

2G, Complex and Adaptive Systems Laboratory (UCD CASL)

School of Medicine and Medical Sciences

University College Dublin

Belfield, Dublin 4

Ireland.




[R] Getting indeices of intersecting elements.

2009-10-14 Thread Praveen Surendran
Hi,

 

Is there a command to get the indices of the intersecting elements of two
vectors? intersect() gives the elements but not their indices.
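
A minimal sketch of two ways to get such indices with base R, on made-up vectors:

```r
x <- c("a", "b", "c", "d")
y <- c("d", "b", "z")

common <- intersect(x, y)

# Positions of the common elements within each vector
idx_x <- match(common, x)
idx_y <- match(common, y)

# Alternative: positions in x of every element that also occurs in y
# (unlike match(), this also reports repeated occurrences in x)
idx_x_all <- which(x %in% y)

print(idx_x)
print(idx_y)
```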

 

Thanks in advance.

 

Praveen Surendran

School of Medicine and Medical Sciences

University College Dublin

Belfield, Dublin 4

Ireland.




[R] index of intersect()

2009-08-09 Thread Praveen Surendran
Hi all,

 

Is there a way to get the index of elements in intersect(x,y) where x and y
are vectors with few common elements.

 

Appreciate your response.

 

Praveen Surendran.

 




[R] Finding missing elements by comparing vectors

2009-07-16 Thread Praveen Surendran
Hi,

 

Is there a function in R to do a-b where a and b are two non-numeric sets
(or intersection complement of these two sets).
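
Base R's set functions cover this. A short sketch on made-up character vectors:

```r
a <- c("rs1", "rs2", "rs3", "rs4")
b <- c("rs2", "rs4", "rs9")

# Elements of a that are not in b (a - b)
only_a <- setdiff(a, b)

# Same result by logical indexing; this form keeps duplicates in a
only_a_idx <- a[!(a %in% b)]

# Elements in exactly one of the two sets (complement of the intersection)
sym_diff <- union(setdiff(a, b), setdiff(b, a))

print(only_a)
print(sym_diff)
```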

 

Kind Regards,

 

Praveen.

 




[R] R regular expression to extract words with the query string.

2009-07-08 Thread Praveen Surendran
Hi,

 

Is there a way in R to get the string which matches the expression, where
the expression is a substring of the parent string.

 

Let's say I have $i <- "transcript:ENST112334 pid:ENSP12345"

What I need is the string "pid:ENSP12345" from $i using the query
"ENSP".

 

Appreciate your comments.

 

Praveen  Surendran

School of Medicine and Medical Sciences

University College Dublin

Belfield, Dublin 4

Ireland.

 




[R] R regular expression to extract words with the query string.

2009-07-08 Thread Praveen Surendran
Thanks Henrique.

This is indeed short and quite simple compared to what I was using which
goes like...

 

unlist(strsplit(i,split=" "))[grep("ENSP",unlist(strsplit(i,split=" ")))]
:-)

 

Cheers,

 

Praveen.

 

From: Henrique Dallazuanna [mailto:www...@gmail.com] 
Sent: 08 July 2009 14:18
To: praveen.surend...@ucd.ie
Cc: r-help@r-project.org
Subject: Re: [R] R regular expression to extract words with the query
string.

 

Try this:

sapply(strsplit(i, ' '), grep, pattern='ENSP', value = T)
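
For comparison, a sketch of a regular-expression alternative that avoids splitting the string first. The pattern is my own choice and assumes the words are separated by single spaces.

```r
i <- "transcript:ENST112334 pid:ENSP12345"

# Match the run of non-space characters that contains the query "ENSP"
hit <- regmatches(i, regexpr("[^ ]*ENSP[^ ]*", i))
print(hit)

# The split-and-grep approach from this thread, written out step by step
words <- strsplit(i, " ")[[1]]
hit_split <- grep("ENSP", words, value = TRUE)
print(hit_split)
```

regexpr() returns only the first match in each string. gregexpr() with regmatches() would return all of them.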

On Wed, Jul 8, 2009 at 10:04 AM, Praveen Surendran
<praveen.surend...@ucd.ie> wrote:

> Hi,
>
> Is there a way in R to get the string which matches the expression, where
> the expression is a substring of the parent string.
>
> Let's say I have $i <- "transcript:ENST112334 pid:ENSP12345"
>
> What I need is the string "pid:ENSP12345" from $i using the query
> "ENSP".
>
> Appreciate your comments.
>
> Praveen  Surendran
> School of Medicine and Medical Sciences
> University College Dublin
> Belfield, Dublin 4
> Ireland.






-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O




[R] Alternate ways of finding number of occurrence of an element in a vector.

2009-06-19 Thread Praveen Surendran
Hi,

 

I have a vector v and would like to find the number of occurrences of
element x in it.

Is there a way other than

sum(as.integer(v==x)) or length(which(x==v))

to do this?

I have a huge file to process. Both of the methods described above are
pretty slow when dealing with a large vector.

I would welcome your comments.
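
A small sketch of two options, on a made-up vector. sum() counts TRUE values directly, so the as.integer() step is not needed. When counts are needed for many different values of x, a single table() pass is usually cheaper than scanning v once per value.

```r
v <- c("a", "b", "a", "c", "a")
x <- "a"

# Count for a single value: the logical vector is summed directly
n_x <- sum(v == x)

# Counts for every distinct value in one pass, then look up the one needed
counts <- table(v)
n_x_tab <- as.integer(counts[x])

print(n_x)
print(counts)
```

If v can contain NA, sum(v == x, na.rm = TRUE) avoids an NA result.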

 

Praveen Surendran.

 

