Re: [aroma.affymetrix] SNP_6 raw intensity Extraction

nawin MOHAMMED Tue, 30 Apr 2013 00:24:34 -0700


Greeting  ,


iam  a Phd student , i  try implement aroma package but its not working 
with me , i don't know where is the error,   and  i  have question the 
shiptype folder  is not exist be default  in annotationdata,  i  just 
 create a folder in annotation data name it chiptype and i put the cdf file 
in it , is that possible ???   please  i need your advice this is my 
program 



> getwd()
[1] 
"C:/Users/nwayyin/Documents/R/win-library/3.0/aroma.affymetrix/annotationData/ChipType"
> ChipType(""HG-U133_Plus_2")
Error: unexpected symbol in "ChipType(""HG"
> ChipType("HG-U133_Plus_2")
Error: could not find function "ChipType"
> chipType("HG-U133_Plus_2")
Error: could not find function "chipType"
> library(aroma.affymetrix)
> ChipeType("HG-U133_Plus_2")
Error: could not find function "ChipeType"
> chipType<-"HG-U133_Plus_2"
> cdf<-AffymetrixcdfFile$byChipType("HG-U133_Plus_2")
Error: object 'AffymetrixcdfFile' not found
> cdf<-AffymetrixcdfFile$bychipType("HG-U133_Plus_2")
Error: object 'AffymetrixcdfFile' not found
> cdf<-AffymetrixcdfFile$byChipType("HG-U133_Plus_2")
Error: object 'AffymetrixcdfFile' not found
> cdf<-HG-U133_Plus_2$bychipType("HG-U133_Plus_2")
Error: object 'HG' not found
> cdf<-AffymetrixcdfFile$bychipType("HG-U133_Plus_2", tags=ChipType)
Error: object 'AffymetrixcdfFile' not found
> 

the error is  with cdf file  which can not be read 
i  download all the package 

  source("http://bioconductor.org/biocLite.R";)

biocLite("biomaRt")

 hbInstall("aroma.affymetrix")

thank you 


On Tuesday, April 30, 2013 8:00:45 AM UTC+1, jonathan....@mail.dcu.ie wrote:
>
> Hi Henrik,
>
> That is exactly what I was looking for. I was missing the "Order by Cell 
> Indices".
> Thank you.
>
> Jonathan
>
> On Monday, April 29, 2013 11:17:57 PM UTC+2, Henrik Bengtsson wrote:
>>
>> Hi. 
>>
>> On Mon, Apr 29, 2013 at 4:16 AM,  <jonathan....@mail.dcu.ie> wrote: 
>> > Hi Henrik, 
>> > 
>> > Thank you for the reply. 
>> > This is quite helpful however from using the following code I am able 
>> to 
>> > produce a matrix of the intensities for each of the probes... 
>> (replicates 
>> > included) 
>> > 
>> > library(aroma.affymetrix) 
>> > cdf <- AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags="Full") 
>> > raw_sample <- AffymetrixCelSet$byName("Sample", cdf=cdf) 
>> > ciS<- getCellIndices(cdf, unlist=TRUE, useNames=TRUE) 
>> > s<- sample(ciS, 250) 
>> > head(ciS) 
>> > d<- extractMatrix(cs, cells=ciS, verbose=-50) 
>> > write.table(file="all_probes.txt",df, quote=FALSE, sep="\t", 
>> > row.names=FALSE) 
>> > 
>> > My issue now is that with this matrix I can see the 6+million probes 
>> however 
>> > the probe ID's are not present. Maybe I am missing something but If you 
>> > could help me associate each probe intensity value with the probe ID I 
>> would 
>> > be very grateful. 
>>
>> What do you mean by 'probe ID'; what you expect it do be?  Note that 
>> in Affymetrix terms, there are 'unit' and 'group' (probeset) 
>> IDs/names, but probes don't really have IDs other that an (x,y) 
>> location or an index (as you use it above). 
>>
>> However, you could pull out probe-specific CDF annotation from the CDF 
>> file as follows: 
>>
>> library("aroma.affymetrix"); 
>> cdf <- AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags="Full"); 
>> # Some example units 
>> units <- c(1012:1013, 950123:950125); 
>> # Read CDF info as data.frame 
>> cdfData <- readDataFrame(cdf, units=units);  # Will take a very long 
>> time if done for many units 
>> # Order by cell indices 
>> o <- order(cdfData$cell); 
>> cdfData <- cdfData[o,]; 
>> 'data.frame':   15 obs. of  16 variables: 
>>  $ unit           : int  1012 1012 1013 1013 1012 1012 101.. 
>>  $ unitType       : chr  "genotyping" "genotyping" "genot".. 
>>  $ unitType       : chr  "genotyping" "genotyping" "genotyping" ... 
>>  $ unitDirection  : chr  "sense" "sense" "sense" "sense" ... 
>>  $ unitNbrOfAtoms : int  6 6 6 6 6 6 6 6 1 1 ... 
>>  $ cell           : int  539387 539388 902651 902652 18384.. 
>>  $ x              : int  706 707 2170 2171 2630 2631 998 9.. 
>>  $ y              : int  201 201 336 336 685 685 1112 1112.. 
>>  $ groupNbrOfAtoms: int  3 3 3 3 3 3 3 3 1 1 ... 
>>  $ cell           : int  539387 539388 902651 902652 ... 
>>  $ unit           : int  1012 1012 1013 1013 1012 1012 101.. 
>>  $ y              : int  201 201 336 336 685 685 1112 1112 1195 ... 
>>  $ unitName       : chr  "SNP_A-2001598" "SNP_A-2001598" ... 
>>  $ unitType       : chr  "genotyping" "genotyping" "genot".. 
>>  $ indexPos       : int  2 1 6 5 4 3 4 3 1 1 ... 
>>  $ cell           : int  539387 539388 902651 902652 18384... 
>>
>> # Note that for some chip types some probes (cells) occur in multiple 
>> # probe sets meaning you may have duplicates.  I don't think this is 
>> # the case for SNP chips though.  Sanity check... 
>> stopifnot(!anyDuplicated(cdfData$cell)); 
>>
>> # Extract the corresponding probe signals from 'csR' (AffymetrixCelSet) 
>> Y <- extractMatrix(csR, cells=cdfData$cell); 
>>
>> # Merge CDF annotation data with signals 
>> data <- cbind(cdfData, Y); 
>>
>> # Save 
>> # [see help("writeDataFrame.data.frame")] 
>> pathname <- writeDataFrame(data, file="all_probes.txt", 
>> header=list(chipType=getChipType(cdf))); 
>>
>>
>> Again, not sure what you're going to use this for/where to import it; 
>> you may end up reinventing the wheel. 
>>
>> Hope this helps 
>>
>> /Henrik 
>>
>> > 
>> > Thanks, 
>> > Jonathan 
>> > 
>> > On Tuesday, April 23, 2013 2:40:56 AM UTC+2, Henrik Bengtsson wrote: 
>> >> 
>> >> Hi Jonathan. 
>> >> 
>> >> On Thu, Apr 11, 2013 at 6:38 AM,  <jonathan....@mail.dcu.ie> wrote: 
>> >> > Hi all, 
>> >> > 
>> >> > I suppose this is a simple enough task even for a newbie like me, I 
>> have 
>> >> > found a similar related post but I have two questions: 
>> >> > 
>> >> > My First Question when I use the following commands in R: 
>> >> > 
>> >> > library(aroma.affymetrix) 
>> >> > 
>> >> > cdf <- AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags="Full") 
>> >> > cs <- AffymetrixCelSet$byName("Arles", cdf=cdf) 
>> >> > 
>> >> > unit <- indexOf(cdf, "SNP_A-8656720") 
>> >> > y <- readUnits(cs, units=unit) 
>> >> > str(y) 
>> >> > 
>> >> > This allows me to gather the raw intensities for a SNP or CN probe. 
>> as 
>> >> > follows: 
>> >> > $`SNP_A-8656720`$A$intensities 
>> >> >      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] 
>> >> > [,13] 
>> >> > [,14] 
>> >> > [1,]  786  807 1051  879 1971 1447 1826  236 1249  2335  1140   416 
>> >> > 1147 
>> >> > 2054 
>> >> > [2,]  694  823 1027  835 1673 1167 1729  252 1068  2339   982   411 
>> >> > 769 
>> >> > 1786 
>> >> > [3,]  752  665  913  820 1621 1356 1555  248 1344  2362  1417   339 
>> >> > 991 
>> >> > 1835 
>> >> > 
>> >> > 
>> >> > $`SNP_A-8656720`$G 
>> >> > $`SNP_A-8656720`$G$intensities 
>> >> >      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] 
>> >> > [,13] 
>> >> > [,14] 
>> >> > [1,]  273 1014  481 1012  402  383  421 1138  321   861   614  1859 
>> >> > 687 
>> >> > 549 
>> >> > [2,]  222  528  476  825  602  719  460  912  417   796   650  1617 
>> >> > 537 
>> >> > 661 
>> >> > [3,]  259  781  543  754  492  452  550  909  316   743   518  1847 
>> >> > 529 
>> >> > 651 
>> >> > 
>> >> > From the previous post this data is supposed to reference to the A 
>> and B 
>> >> > allele and for the forward and reverse strands. My question is, what 
>> >> > refers 
>> >> > the Allele A/B ($A/$G) and forward / reverse, also by that logic 
>> there 
>> >> > should be 4 sets of data? 
>> >> 
>> >> Starting with SNP chip GenomeWideSNP_5 (sic!), Affymetrix no longer 
>> >> put down probes for both strands.  Instead, they pick(ed) the strand, 
>> >> reverse *or* forward, that worked "the best" (the best genotyping 
>> >> performance).  Try this: 
>> >> 
>> >> > counts <- nbrOfGroupsPerUnit(cdf);  # Slow first time 
>> >> > table(counts); 
>> >> counts 
>> >>      1      2      4 
>> >> 946447 906874   2748 
>> >> 
>> >> As you see, there are only 2,748 SNPs with both strands, whereas the 
>> >> majority only have one. 
>> >> 
>> >> BTW, those "sets" of probes are formally referred to as "unit groups" 
>> >> (it's ok to also call them "probe sets").  To find out which strands 
>> >> the different unit groups are querying, you can do: 
>> >> 
>> >> > unit <- indexOf(cdf, "SNP_A-8656720"); 
>> >> > cdfList <- readUnits(cdf, units=unit); 
>> >> > str(cdfList) 
>> >> List of 1 
>> >>  $ SNP_A-8656720:List of 3 
>> >>   ..$ type     : int 2 
>> >>   ..$ direction: int 2 
>> >>   ..$ groups   :List of 2 
>> >>   .. ..$ A:List of 6 
>> >>   .. .. ..$ x        : int [1:3] 1565 193 1629 
>> >>   .. .. ..$ y        : int [1:3] 83 1466 2038 
>> >>   .. .. ..$ pbase    : chr [1:3] "a" "a" "a" 
>> >>   .. .. ..$ tbase    : chr [1:3] "t" "t" "t" 
>> >>   .. .. ..$ expos    : int [1:3] 13 13 13 
>> >>   .. .. ..$ direction: int 2 
>> >>   .. ..$ G:List of 6 
>> >>   .. .. ..$ x        : int [1:3] 1564 192 1628 
>> >>   .. .. ..$ y        : int [1:3] 83 1466 2038 
>> >>   .. .. ..$ pbase    : chr [1:3] "g" "g" "g" 
>> >>   .. .. ..$ tbase    : chr [1:3] "c" "c" "c" 
>> >>   .. .. ..$ expos    : int [1:3] 13 13 13 
>> >>   .. .. ..$ direction: int 2 
>> >> 
>> >> See that 'direction'?  It specifies the strand, cf. 
>> >> help("readCdfUnits", package="affxparser") [the function used 
>> >> internally by readUnits()]. 
>> >> 
>> >> 
>> >> > As for CN probes this is much more obvious as there is only one 
>> probe 
>> >> > interrogating the position. 
>> >> 
>> >> Correct. 
>> >> 
>> >> > 
>> >> > My Second Question is in relation to expanding this code in order to 
>> >> > generate the raw intensities of all the SNP_ probes and then to run 
>> the 
>> >> > Sam 
>> >> > e for CN_ probes so that I have the raw intensities for each 
>> seperately? 
>> >> 
>> >> > types <- getUnitTypes(cdf);  # Slow first time 
>> >> > table(types); 
>> >> types 
>> >>      1      2      5 
>> >>    621 909622 945826 
>> >> 
>> >> Here type "2" refers to a "SNP" and "5" a "CN unit".   For this chip 
>> >> type you could also infer which the SNPs/CN probes by their unit names 
>> >> ("SNP_", and "CN_"), but that assumes that the names follow that 
>> >> convention which is *not* true for other chip types.  So, use 
>> >> getUnitTypes(): 
>> >> 
>> >> # SNPs 
>> >> > snpUnits <- which(types == 2); 
>> >> > str(snpUnits) 
>> >>  int [1:909622] 622 623 624 625 626 627 628 ... 
>> >> 
>> >> # CN probes (aka non-polymorphic loci). 
>> >> > npUnits <- which(types == 5); 
>> >> > str(npUnits) 
>> >>  int [1:945826] 910244 910245 910246 910247 ... 
>> >> 
>> >> Hope this helps 
>> >> 
>> >> /Henrik 
>> >> 
>> >> > 
>> >> > 
>> >> > Cheers, 
>> >> > Jonathan 
>> >> > 
>> >> > -- 
>> >> > -- 
>> >> > When reporting problems on aroma.affymetrix, make sure 1) to run the 
>> >> > latest 
>> >> > version of the package, 2) to report the output of sessionInfo() and 
>> >> > traceback(), and 3) to post a complete code example. 
>> >> > 
>> >> > 
>> >> > You received this message because you are subscribed to the Google 
>> >> > Groups 
>> >> > "aroma.affymetrix" group with website http://www.aroma-project.org/. 
>>
>> >> > To post to this group, send email to aroma-af...@googlegroups.com 
>> >> > To unsubscribe and other options, go to 
>> >> > http://www.aroma-project.org/forum/ 
>> >> > 
>> >> > --- 
>> >> > You received this message because you are subscribed to the Google 
>> >> > Groups 
>> >> > "aroma.affymetrix" group. 
>> >> > To unsubscribe from this group and stop receiving emails from it, 
>> send 
>> >> > an 
>> >> > email to aroma-affymetr...@googlegroups.com. 
>> >> > For more options, visit https://groups.google.com/groups/opt_out. 
>> >> > 
>> >> > 
>>
>

-- 
-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/

--- 
You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to aroma-affymetrix+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Re: [aroma.affymetrix] SNP_6 raw intensity Extraction

Reply via email to