Greeting ,
iam a Phd student , i try implement aroma package but its not working with me , i don't know where is the error, and i have question the shiptype folder is not exist be default in annotationdata, i just create a folder in annotation data name it chiptype and i put the cdf file in it , is that possible ??? please i need your advice this is my program > getwd() [1] "C:/Users/nwayyin/Documents/R/win-library/3.0/aroma.affymetrix/annotationData/ChipType" > ChipType(""HG-U133_Plus_2") Error: unexpected symbol in "ChipType(""HG" > ChipType("HG-U133_Plus_2") Error: could not find function "ChipType" > chipType("HG-U133_Plus_2") Error: could not find function "chipType" > library(aroma.affymetrix) > ChipeType("HG-U133_Plus_2") Error: could not find function "ChipeType" > chipType<-"HG-U133_Plus_2" > cdf<-AffymetrixcdfFile$byChipType("HG-U133_Plus_2") Error: object 'AffymetrixcdfFile' not found > cdf<-AffymetrixcdfFile$bychipType("HG-U133_Plus_2") Error: object 'AffymetrixcdfFile' not found > cdf<-AffymetrixcdfFile$byChipType("HG-U133_Plus_2") Error: object 'AffymetrixcdfFile' not found > cdf<-HG-U133_Plus_2$bychipType("HG-U133_Plus_2") Error: object 'HG' not found > cdf<-AffymetrixcdfFile$bychipType("HG-U133_Plus_2", tags=ChipType) Error: object 'AffymetrixcdfFile' not found > the error is with cdf file which can not be read i download all the package source("http://bioconductor.org/biocLite.R") biocLite("biomaRt") hbInstall("aroma.affymetrix") thank you On Tuesday, April 30, 2013 8:00:45 AM UTC+1, jonathan....@mail.dcu.ie wrote: > > Hi Henrik, > > That is exactly what I was looking for. I was missing the "Order by Cell > Indices". > Thank you. > > Jonathan > > On Monday, April 29, 2013 11:17:57 PM UTC+2, Henrik Bengtsson wrote: >> >> Hi. >> >> On Mon, Apr 29, 2013 at 4:16 AM, <jonathan....@mail.dcu.ie> wrote: >> > Hi Henrik, >> > >> > Thank you for the reply. >> > This is quite helpful however from using the following code I am able >> to >> > produce a matrix of the intensities for each of the probes... >> (replicates >> > included) >> > >> > library(aroma.affymetrix) >> > cdf <- AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags="Full") >> > raw_sample <- AffymetrixCelSet$byName("Sample", cdf=cdf) >> > ciS<- getCellIndices(cdf, unlist=TRUE, useNames=TRUE) >> > s<- sample(ciS, 250) >> > head(ciS) >> > d<- extractMatrix(cs, cells=ciS, verbose=-50) >> > write.table(file="all_probes.txt",df, quote=FALSE, sep="\t", >> > row.names=FALSE) >> > >> > My issue now is that with this matrix I can see the 6+million probes >> however >> > the probe ID's are not present. Maybe I am missing something but If you >> > could help me associate each probe intensity value with the probe ID I >> would >> > be very grateful. >> >> What do you mean by 'probe ID'; what you expect it do be? Note that >> in Affymetrix terms, there are 'unit' and 'group' (probeset) >> IDs/names, but probes don't really have IDs other that an (x,y) >> location or an index (as you use it above). >> >> However, you could pull out probe-specific CDF annotation from the CDF >> file as follows: >> >> library("aroma.affymetrix"); >> cdf <- AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags="Full"); >> # Some example units >> units <- c(1012:1013, 950123:950125); >> # Read CDF info as data.frame >> cdfData <- readDataFrame(cdf, units=units); # Will take a very long >> time if done for many units >> # Order by cell indices >> o <- order(cdfData$cell); >> cdfData <- cdfData[o,]; >> 'data.frame': 15 obs. of 16 variables: >> $ unit : int 1012 1012 1013 1013 1012 1012 101.. >> $ unitType : chr "genotyping" "genotyping" "genot".. >> $ unitType : chr "genotyping" "genotyping" "genotyping" ... >> $ unitDirection : chr "sense" "sense" "sense" "sense" ... >> $ unitNbrOfAtoms : int 6 6 6 6 6 6 6 6 1 1 ... >> $ cell : int 539387 539388 902651 902652 18384.. >> $ x : int 706 707 2170 2171 2630 2631 998 9.. >> $ y : int 201 201 336 336 685 685 1112 1112.. >> $ groupNbrOfAtoms: int 3 3 3 3 3 3 3 3 1 1 ... >> $ cell : int 539387 539388 902651 902652 ... >> $ unit : int 1012 1012 1013 1013 1012 1012 101.. >> $ y : int 201 201 336 336 685 685 1112 1112 1195 ... >> $ unitName : chr "SNP_A-2001598" "SNP_A-2001598" ... >> $ unitType : chr "genotyping" "genotyping" "genot".. >> $ indexPos : int 2 1 6 5 4 3 4 3 1 1 ... >> $ cell : int 539387 539388 902651 902652 18384... >> >> # Note that for some chip types some probes (cells) occur in multiple >> # probe sets meaning you may have duplicates. I don't think this is >> # the case for SNP chips though. Sanity check... >> stopifnot(!anyDuplicated(cdfData$cell)); >> >> # Extract the corresponding probe signals from 'csR' (AffymetrixCelSet) >> Y <- extractMatrix(csR, cells=cdfData$cell); >> >> # Merge CDF annotation data with signals >> data <- cbind(cdfData, Y); >> >> # Save >> # [see help("writeDataFrame.data.frame")] >> pathname <- writeDataFrame(data, file="all_probes.txt", >> header=list(chipType=getChipType(cdf))); >> >> >> Again, not sure what you're going to use this for/where to import it; >> you may end up reinventing the wheel. >> >> Hope this helps >> >> /Henrik >> >> > >> > Thanks, >> > Jonathan >> > >> > On Tuesday, April 23, 2013 2:40:56 AM UTC+2, Henrik Bengtsson wrote: >> >> >> >> Hi Jonathan. >> >> >> >> On Thu, Apr 11, 2013 at 6:38 AM, <jonathan....@mail.dcu.ie> wrote: >> >> > Hi all, >> >> > >> >> > I suppose this is a simple enough task even for a newbie like me, I >> have >> >> > found a similar related post but I have two questions: >> >> > >> >> > My First Question when I use the following commands in R: >> >> > >> >> > library(aroma.affymetrix) >> >> > >> >> > cdf <- AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags="Full") >> >> > cs <- AffymetrixCelSet$byName("Arles", cdf=cdf) >> >> > >> >> > unit <- indexOf(cdf, "SNP_A-8656720") >> >> > y <- readUnits(cs, units=unit) >> >> > str(y) >> >> > >> >> > This allows me to gather the raw intensities for a SNP or CN probe. >> as >> >> > follows: >> >> > $`SNP_A-8656720`$A$intensities >> >> > [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] >> >> > [,13] >> >> > [,14] >> >> > [1,] 786 807 1051 879 1971 1447 1826 236 1249 2335 1140 416 >> >> > 1147 >> >> > 2054 >> >> > [2,] 694 823 1027 835 1673 1167 1729 252 1068 2339 982 411 >> >> > 769 >> >> > 1786 >> >> > [3,] 752 665 913 820 1621 1356 1555 248 1344 2362 1417 339 >> >> > 991 >> >> > 1835 >> >> > >> >> > >> >> > $`SNP_A-8656720`$G >> >> > $`SNP_A-8656720`$G$intensities >> >> > [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] >> >> > [,13] >> >> > [,14] >> >> > [1,] 273 1014 481 1012 402 383 421 1138 321 861 614 1859 >> >> > 687 >> >> > 549 >> >> > [2,] 222 528 476 825 602 719 460 912 417 796 650 1617 >> >> > 537 >> >> > 661 >> >> > [3,] 259 781 543 754 492 452 550 909 316 743 518 1847 >> >> > 529 >> >> > 651 >> >> > >> >> > From the previous post this data is supposed to reference to the A >> and B >> >> > allele and for the forward and reverse strands. My question is, what >> >> > refers >> >> > the Allele A/B ($A/$G) and forward / reverse, also by that logic >> there >> >> > should be 4 sets of data? >> >> >> >> Starting with SNP chip GenomeWideSNP_5 (sic!), Affymetrix no longer >> >> put down probes for both strands. Instead, they pick(ed) the strand, >> >> reverse *or* forward, that worked "the best" (the best genotyping >> >> performance). Try this: >> >> >> >> > counts <- nbrOfGroupsPerUnit(cdf); # Slow first time >> >> > table(counts); >> >> counts >> >> 1 2 4 >> >> 946447 906874 2748 >> >> >> >> As you see, there are only 2,748 SNPs with both strands, whereas the >> >> majority only have one. >> >> >> >> BTW, those "sets" of probes are formally referred to as "unit groups" >> >> (it's ok to also call them "probe sets"). To find out which strands >> >> the different unit groups are querying, you can do: >> >> >> >> > unit <- indexOf(cdf, "SNP_A-8656720"); >> >> > cdfList <- readUnits(cdf, units=unit); >> >> > str(cdfList) >> >> List of 1 >> >> $ SNP_A-8656720:List of 3 >> >> ..$ type : int 2 >> >> ..$ direction: int 2 >> >> ..$ groups :List of 2 >> >> .. ..$ A:List of 6 >> >> .. .. ..$ x : int [1:3] 1565 193 1629 >> >> .. .. ..$ y : int [1:3] 83 1466 2038 >> >> .. .. ..$ pbase : chr [1:3] "a" "a" "a" >> >> .. .. ..$ tbase : chr [1:3] "t" "t" "t" >> >> .. .. ..$ expos : int [1:3] 13 13 13 >> >> .. .. ..$ direction: int 2 >> >> .. ..$ G:List of 6 >> >> .. .. ..$ x : int [1:3] 1564 192 1628 >> >> .. .. ..$ y : int [1:3] 83 1466 2038 >> >> .. .. ..$ pbase : chr [1:3] "g" "g" "g" >> >> .. .. ..$ tbase : chr [1:3] "c" "c" "c" >> >> .. .. ..$ expos : int [1:3] 13 13 13 >> >> .. .. ..$ direction: int 2 >> >> >> >> See that 'direction'? It specifies the strand, cf. >> >> help("readCdfUnits", package="affxparser") [the function used >> >> internally by readUnits()]. >> >> >> >> >> >> > As for CN probes this is much more obvious as there is only one >> probe >> >> > interrogating the position. >> >> >> >> Correct. >> >> >> >> > >> >> > My Second Question is in relation to expanding this code in order to >> >> > generate the raw intensities of all the SNP_ probes and then to run >> the >> >> > Sam >> >> > e for CN_ probes so that I have the raw intensities for each >> seperately? >> >> >> >> > types <- getUnitTypes(cdf); # Slow first time >> >> > table(types); >> >> types >> >> 1 2 5 >> >> 621 909622 945826 >> >> >> >> Here type "2" refers to a "SNP" and "5" a "CN unit". For this chip >> >> type you could also infer which the SNPs/CN probes by their unit names >> >> ("SNP_", and "CN_"), but that assumes that the names follow that >> >> convention which is *not* true for other chip types. So, use >> >> getUnitTypes(): >> >> >> >> # SNPs >> >> > snpUnits <- which(types == 2); >> >> > str(snpUnits) >> >> int [1:909622] 622 623 624 625 626 627 628 ... >> >> >> >> # CN probes (aka non-polymorphic loci). >> >> > npUnits <- which(types == 5); >> >> > str(npUnits) >> >> int [1:945826] 910244 910245 910246 910247 ... >> >> >> >> Hope this helps >> >> >> >> /Henrik >> >> >> >> > >> >> > >> >> > Cheers, >> >> > Jonathan >> >> > >> >> > -- >> >> > -- >> >> > When reporting problems on aroma.affymetrix, make sure 1) to run the >> >> > latest >> >> > version of the package, 2) to report the output of sessionInfo() and >> >> > traceback(), and 3) to post a complete code example. >> >> > >> >> > >> >> > You received this message because you are subscribed to the Google >> >> > Groups >> >> > "aroma.affymetrix" group with website http://www.aroma-project.org/. >> >> >> > To post to this group, send email to aroma-af...@googlegroups.com >> >> > To unsubscribe and other options, go to >> >> > http://www.aroma-project.org/forum/ >> >> > >> >> > --- >> >> > You received this message because you are subscribed to the Google >> >> > Groups >> >> > "aroma.affymetrix" group. >> >> > To unsubscribe from this group and stop receiving emails from it, >> send >> >> > an >> >> > email to aroma-affymetr...@googlegroups.com. >> >> > For more options, visit https://groups.google.com/groups/opt_out. >> >> > >> >> > >> > -- -- When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group with website http://www.aroma-project.org/. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe and other options, go to http://www.aroma-project.org/forum/ --- You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To unsubscribe from this group and stop receiving emails from it, send an email to aroma-affymetrix+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.