Hi Jonathan. On Thu, Apr 11, 2013 at 6:38 AM, <jonathan.crowth...@mail.dcu.ie> wrote: > Hi all, > > I suppose this is a simple enough task even for a newbie like me, I have > found a similar related post but I have two questions: > > My First Question when I use the following commands in R: > > library(aroma.affymetrix) > > cdf <- AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags="Full") > cs <- AffymetrixCelSet$byName("Arles", cdf=cdf) > > unit <- indexOf(cdf, "SNP_A-8656720") > y <- readUnits(cs, units=unit) > str(y) > > This allows me to gather the raw intensities for a SNP or CN probe. as > follows: > $`SNP_A-8656720`$A$intensities > [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] > [,14] > [1,] 786 807 1051 879 1971 1447 1826 236 1249 2335 1140 416 1147 > 2054 > [2,] 694 823 1027 835 1673 1167 1729 252 1068 2339 982 411 769 > 1786 > [3,] 752 665 913 820 1621 1356 1555 248 1344 2362 1417 339 991 > 1835 > > > $`SNP_A-8656720`$G > $`SNP_A-8656720`$G$intensities > [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] > [,14] > [1,] 273 1014 481 1012 402 383 421 1138 321 861 614 1859 687 > 549 > [2,] 222 528 476 825 602 719 460 912 417 796 650 1617 537 > 661 > [3,] 259 781 543 754 492 452 550 909 316 743 518 1847 529 > 651 > > From the previous post this data is supposed to reference to the A and B > allele and for the forward and reverse strands. My question is, what refers > the Allele A/B ($A/$G) and forward / reverse, also by that logic there > should be 4 sets of data?
Starting with SNP chip GenomeWideSNP_5 (sic!), Affymetrix no longer put down probes for both strands. Instead, they pick(ed) the strand, reverse *or* forward, that worked "the best" (the best genotyping performance). Try this: > counts <- nbrOfGroupsPerUnit(cdf); # Slow first time > table(counts); counts 1 2 4 946447 906874 2748 As you see, there are only 2,748 SNPs with both strands, whereas the majority only have one. BTW, those "sets" of probes are formally referred to as "unit groups" (it's ok to also call them "probe sets"). To find out which strands the different unit groups are querying, you can do: > unit <- indexOf(cdf, "SNP_A-8656720"); > cdfList <- readUnits(cdf, units=unit); > str(cdfList) List of 1 $ SNP_A-8656720:List of 3 ..$ type : int 2 ..$ direction: int 2 ..$ groups :List of 2 .. ..$ A:List of 6 .. .. ..$ x : int [1:3] 1565 193 1629 .. .. ..$ y : int [1:3] 83 1466 2038 .. .. ..$ pbase : chr [1:3] "a" "a" "a" .. .. ..$ tbase : chr [1:3] "t" "t" "t" .. .. ..$ expos : int [1:3] 13 13 13 .. .. ..$ direction: int 2 .. ..$ G:List of 6 .. .. ..$ x : int [1:3] 1564 192 1628 .. .. ..$ y : int [1:3] 83 1466 2038 .. .. ..$ pbase : chr [1:3] "g" "g" "g" .. .. ..$ tbase : chr [1:3] "c" "c" "c" .. .. ..$ expos : int [1:3] 13 13 13 .. .. ..$ direction: int 2 See that 'direction'? It specifies the strand, cf. help("readCdfUnits", package="affxparser") [the function used internally by readUnits()]. > As for CN probes this is much more obvious as there is only one probe > interrogating the position. Correct. > > My Second Question is in relation to expanding this code in order to > generate the raw intensities of all the SNP_ probes and then to run the Sam > e for CN_ probes so that I have the raw intensities for each seperately? > types <- getUnitTypes(cdf); # Slow first time > table(types); types 1 2 5 621 909622 945826 Here type "2" refers to a "SNP" and "5" a "CN unit". For this chip type you could also infer which the SNPs/CN probes by their unit names ("SNP_", and "CN_"), but that assumes that the names follow that convention which is *not* true for other chip types. So, use getUnitTypes(): # SNPs > snpUnits <- which(types == 2); > str(snpUnits) int [1:909622] 622 623 624 625 626 627 628 ... # CN probes (aka non-polymorphic loci). > npUnits <- which(types == 5); > str(npUnits) int [1:945826] 910244 910245 910246 910247 ... Hope this helps /Henrik > > > Cheers, > Jonathan > > -- > -- > When reporting problems on aroma.affymetrix, make sure 1) to run the latest > version of the package, 2) to report the output of sessionInfo() and > traceback(), and 3) to post a complete code example. > > > You received this message because you are subscribed to the Google Groups > "aroma.affymetrix" group with website http://www.aroma-project.org/. > To post to this group, send email to aroma-affymetrix@googlegroups.com > To unsubscribe and other options, go to http://www.aroma-project.org/forum/ > > --- > You received this message because you are subscribed to the Google Groups > "aroma.affymetrix" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to aroma-affymetrix+unsubscr...@googlegroups.com. > For more options, visit https://groups.google.com/groups/opt_out. > > -- -- When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group with website http://www.aroma-project.org/. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe and other options, go to http://www.aroma-project.org/forum/ --- You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To unsubscribe from this group and stop receiving emails from it, send an email to aroma-affymetrix+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.