Re: [aroma.affymetrix] genotyping crlmm genomewidesnp 6.0
Good afternoon,

First of all, my apologies for the delay in replying.

On Saturday, March 23, 2013 at 22:28:52 UTC+1, Henrik Bengtsson wrote:
> Hi.
>
> On Sat, Mar 23, 2013 at 4:00 AM, Carles Hernández wrote:
> > Good morning,
> >
> > First of all, thanks for answering so fast. It's really helpful to be
> > able to talk with the main creator of the library.
> >
> > Going back to the topic, sorry I didn't express myself properly. I
> > "have no idea" what the CEL files contain, so the idea is to analyze
> > the microarrays using the FreqB, LRR and genotypes. Some of them may
> > be tumoral, but I can't know. I will use the genotypes to classify the
> > probes into AA, AB and BB in order to study the FreqB compared with
> > LRR, and use an external program called MAD.
>
> But do you agree with me that it does not make sense to classify a SNP
> into (AA, AB, BB), i.e. call the genotype, if the SNP is for instance
> A, ABB, AAABB, or even worse a mixture of, say, 10% A, 38.5% ABB and
> 40.1% AAABB and the rest being the normal AB? So, I still argue that
> genotypes will only make sense for SNPs that you know are normal. If
> you don't know which samples are normal and which are tumors you will
> never know which SNPs/genotype calls you can trust, which to me makes
> the (artificial) genotype calls useless. Although I still haven't seen
> one, I'm all ears for a good argument for where it makes sense to call
> genotypes in a tumor. I'm just trying to save you from wasting your
> time going down the wrong path.

Yes, I agree with that, but what I actually want is a BAF estimate. I
wanted to use CRLMM for that, since it also predicts the genotype, but
it is not ready for GenomeWideSNP 6.0, so using the CRMAv2
implementation, which estimates BAF pretty well, may be a solution.

> Could you provide a reference to "MAD" - never heard of it.

Here you can get some information related to "MAD":
- http://www.biomedcentral.com/1471-2105/12/166
- http://www.creal.cat/jrgonzalez/software.htm#ancla-MAD

> > So, you said CRLMM is not implemented for GenomeWideSNP 6.0; may I
> > contribute by implementing it?
>
> Certainly, that would be great and most appreciated. Just a heads up,
> it's more than a standard programming task. It requires diving into
> the oligo::crlmm() code and its algorithm to find out which modules
> can be reused and which need to be ported. The two files CrlmmModel.R
> and CrlmmModel.EXT.R in aroma.affymetrix/R/ would serve as a good
> start/template:
>
> https://r-forge.r-project.org/scm/viewvc.php/pkg/aroma.affymetrix/R/CrlmmModel.R?view=markup&root=aroma-dots
>
> https://r-forge.r-project.org/scm/viewvc.php/pkg/aroma.affymetrix/R/CrlmmModel.EXT.R?view=markup&root=aroma-dots
>
> If you look inside oligo::crlmm() you see that it itself takes two
> separate paths depending on whether the chip type is (a)
> Mapping50K_(Hind|Xba)240 and Mapping250K_(Nsp|Sty) [which is ported to
> aroma.affymetrix], or (b) GenomeWideSNP_(5|6) [which is not ported].
> In other words, it's the internal oligo:::genotypeOne() that needs to
> be ported.

Actually, I am currently battling with crlmm, oligo and oligoClasses to
handle my GenomeWideSNP CEL files. My priority is to finish this
analysis, but maybe I will lend a hand with this porting; I am not sure
yet, but I have it in mind.

> > Anyway, thank you for sharing the aroma.affymetrix suite with us.
>
> You're welcome - hopefully it makes everyday science a bit easier.
>
> /Henrik

Many thanks for your answers.

Carles

PS: Has any consideration been given to applying CRLMM to Affymetrix
Axiom and Affymetrix Axiom Exome arrays?
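The chip-type split described in this thread (path (a) ported, path (b) not ported) can be sketched in R as follows. This is only an illustration: crlmm_path() is a hypothetical helper, not a function in oligo or aroma.affymetrix, and stripping tags after the first comma (as in "GenomeWideSNP_6,Full") is an assumption about how the chip type string is passed in.

```r
# Hypothetical helper (NOT part of oligo or aroma.affymetrix): reports which
# of the two paths mentioned in the thread a chip type would fall under.
crlmm_path <- function(chip_type) {
  # Assumption: drop any tags after the first comma, e.g. ",Full"
  base <- sub(",.*$", "", chip_type)
  if (grepl("^Mapping50K_(Hind|Xba)240$", base) ||
      grepl("^Mapping250K_(Nsp|Sty)$", base)) {
    "a: ported to aroma.affymetrix (CrlmmModel)"
  } else if (grepl("^GenomeWideSNP_(5|6)$", base)) {
    "b: not ported (handled by oligo:::genotypeOne())"
  } else {
    "unknown chip type"
  }
}

print(crlmm_path("Mapping250K_Nsp"))
print(crlmm_path("GenomeWideSNP_6,Full"))
```

A check of this kind would only tell you up front whether CrlmmModel() is expected to fail for a given chip type; it does not replace the porting work itself.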
Re: [aroma.affymetrix] genotyping crlmm genomewidesnp 6.0
Hi.

On Sat, Mar 23, 2013 at 4:00 AM, Carles Hernández wrote:
> Good morning,
>
> First of all, thanks for answering so fast. It's really helpful to be
> able to talk with the main creator of the library.
>
> Going back to the topic, sorry I didn't express myself properly. I
> "have no idea" what the CEL files contain, so the idea is to analyze
> the microarrays using the FreqB, LRR and genotypes. Some of them may be
> tumoral, but I can't know. I will use the genotypes to classify the
> probes into AA, AB and BB in order to study the FreqB compared with
> LRR, and use an external program called MAD.

But do you agree with me that it does not make sense to classify a SNP
into (AA, AB, BB), i.e. call the genotype, if the SNP is for instance
A, ABB, AAABB, or even worse a mixture of, say, 10% A, 38.5% ABB and
40.1% AAABB and the rest being the normal AB? So, I still argue that
genotypes will only make sense for SNPs that you know are normal. If
you don't know which samples are normal and which are tumors you will
never know which SNPs/genotype calls you can trust, which to me makes
the (artificial) genotype calls useless. Although I still haven't seen
one, I'm all ears for a good argument for where it makes sense to call
genotypes in a tumor. I'm just trying to save you from wasting your
time going down the wrong path.

Could you provide a reference to "MAD" - never heard of it.

> So, you said CRLMM is not implemented for GenomeWideSNP 6.0; may I
> contribute by implementing it?

Certainly, that would be great and most appreciated. Just a heads up,
it's more than a standard programming task. It requires diving into
the oligo::crlmm() code and its algorithm to find out which modules
can be reused and which need to be ported. The two files CrlmmModel.R
and CrlmmModel.EXT.R in aroma.affymetrix/R/ would serve as a good
start/template:

https://r-forge.r-project.org/scm/viewvc.php/pkg/aroma.affymetrix/R/CrlmmModel.R?view=markup&root=aroma-dots

https://r-forge.r-project.org/scm/viewvc.php/pkg/aroma.affymetrix/R/CrlmmModel.EXT.R?view=markup&root=aroma-dots

If you look inside oligo::crlmm() you see that it itself takes two
separate paths depending on whether the chip type is (a)
Mapping50K_(Hind|Xba)240 and Mapping250K_(Nsp|Sty) [which is ported to
aroma.affymetrix], or (b) GenomeWideSNP_(5|6) [which is not ported].
In other words, it's the internal oligo:::genotypeOne() that needs to
be ported.

> Anyway, thank you for sharing the aroma.affymetrix suite with us.

You're welcome - hopefully it makes everyday science a bit easier.

/Henrik
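The mixture example in Henrik's message (10% A, 38.5% ABB, 40.1% AAABB, the rest normal AB) can be worked through numerically. A minimal sketch in base R, assuming an idealized, noise-free signal in which each cell population contributes allele copies in proportion to its fraction; the variable names are illustrative and nothing here comes from aroma.affymetrix or oligo:

```r
# Cell fractions from the example; the remainder is assigned to normal AB.
frac <- c(A = 0.10, ABB = 0.385, AAABB = 0.401)
frac <- c(frac, AB = 1 - sum(frac))

# Copies of allele B and total copies for each state
nB <- c(A = 0, ABB = 2, AAABB = 2, AB = 1)
nT <- c(A = 1, ABB = 3, AAABB = 5, AB = 2)

# Assumption: observed signal is the fraction-weighted sum of allele copies.
total_cn <- sum(frac * nT)          # average total copy number
baf <- sum(frac * nB) / total_cn    # expected B-allele fraction

print(round(c(total_cn = total_cn, baf = baf), 3))
```

Under these assumptions the expected B-allele fraction is roughly 0.48, close to what a normal heterozygous AB SNP would give, while the average total copy number is about 3.5. A genotype caller would likely report "AB" here, which says nothing useful about the underlying states.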
Re: [aroma.affymetrix] genotyping crlmm genomewidesnp 6.0
Good morning,

First of all, thanks for answering so fast. It's really helpful to be
able to talk with the main creator of the library.

Going back to the topic, sorry I didn't express myself properly. I "have
no idea" what the CEL files contain, so the idea is to analyze the
microarrays using the FreqB, LRR and genotypes. Some of them may be
tumoral, but I can't know. I will use the genotypes to classify the
probes into AA, AB and BB in order to study the FreqB compared with LRR,
and use an external program called MAD.

So, you said CRLMM is not implemented for GenomeWideSNP 6.0; may I
contribute by implementing it?

Anyway, thank you for sharing the aroma.affymetrix suite with us.
Re: [aroma.affymetrix] genotyping crlmm genomewidesnp 6.0
Hi Carles,

the quick answer is that aroma.affymetrix only implements the CRLMM
method for the 100K (Mapping50K_Xba142 and Mapping50K_Hind142) and
500K (Mapping250K_Nsp and Mapping250K_Sty) chip types. For newer
methods you need to turn to the Bioconductor 'oligo' package.

However, what are you going to use the genotypes for? I'm asking
because it is rather common, and according to me incorrect, to try to
call genotypes in tumor samples. Genotypes are really only defined in
normal/germline genomes and most (all?) genotype methods assume that
the samples are such. Calling "genotypes" in tumors is rather a
problem of inferring parent-specific CNs (PSCNs) - not at the
SNP-by-SNP level but in segments along the genome. Contrary to normal
PSCNs ("genotypes"), tumor PSCNs may not take discrete levels due to
clonality and normal contamination. In other words, if you do indeed
have tumors, it does not make sense to use CRLMM on them. Instead you
want to do PSCN segmentation/calling.

Hope this helps

Henrik

On Fri, Mar 22, 2013 at 7:47 AM, Carles Hernández wrote:
> Good afternoon,
>
> I am trying to analyse a set of CEL files from Affymetrix GenomeWideSNP
> 6.0 and get their LRR, FreqB and genotype (for all individuals and for
> all chromosomes).
>
> I have started with the vignettes "CRMA (v1): Total copy number analysis
> using CRMA v1 (10K, 100K, 500K)" and "CRMA (v2): Estimation of total
> copy numbers using the CRMA v2 method (10K-CytoScanHD)" since I am new
> to the world of microarray analysis.
>
> But since I didn't find any way to retrieve the genotypes, I moved on
> to "CRLMM genotyping (100K and 500K)".
>
> So, from both methods I can get the LRR and FreqB with extactCNT or with
> extractTotalAndFreqB, but only from the second one (CRLMM) can I use
> extractGenotypes (because the chip type's CRLMM model is required). On
> the other hand, when I try to create the CRLMM model for GenomeWideSNP
> 6.0 the following error occurs:
>
> Exception: Cannot fit CRLMM model: Model fitting for this chip type is not supported/implemented: GenomeWideSNP_6
>   at #02. CrlmmModel(ces, tags = "*,oligo")
>           - CrlmmModel() is in environment 'aroma.affymetrix'
>   at #01. process_dataset("GenomeWideSNP_6", "gal", verbose = TRUE)
>           - process_dataset() is in environment 'R_GlobalEnv'
> Error: Cannot fit CRLMM model: Model fitting for this chip type is not supported/implemented: GenomeWideSNP_6
>
> So... Am I doing something wrong? If not, is there some way to get the
> full set of data I need (sample's name, sample's position, chromosome,
> LRR, FreqB and genotype) using a single method?
>
> My full code-snippet:
>
> library( 'aroma.affymetrix' )
>
> write_table <- function( dataset, file_name ) {
>   [...]
> }
>
> # Note: chr_list and add_to_ds() are not defined in this snippet.
> process_dataset <- function( dataset_name, chip_type ) {
>   cdf <- AffymetrixCdfFile$byChipType( chip_type );
>   csR <- AffymetrixCelSet$byName( dataset_name, cdf=cdf );
>   ces <- justSNPRMA( csR, normalizeToHapmap=TRUE, returnESet=FALSE );
>   crlmm <- CrlmmModel( ces, tags="*,oligo" );   # the error is raised here
>   units <- fit( crlmm, ram="oligo" );
>   callSet <- getCallSet( crlmm );
>
>   gi <- getGenomeInformation( cdf );
>
>   # One output table per array
>   for( array in 1:length( csR ) ) {
>     ds <- NULL;
>     ce <- getFile( ces, array );
>     for( chr in chr_list ) {
>       chrunits <- getUnitsOnChromosome( gi, chromosome=chr );
>       chrnames <- getUnitNames( cdf, units=chrunits )
>       pos <- getPositions( gi, units=chrunits ); # / 1e6;
>       cf <- getFile( callSet, array );
>       calls <- extractGenotypes( cf, units=chrunits );
>       dta <- extractTotalAndFreqB( ce, units=chrunits );
>       theta <- dta[,"total"];
>
>       # Reference signal: the average across all arrays
>       ceR <- getAverageFile( ces );
>       dataR <- extractTotalAndFreqB( ceR, units=chrunits );
>       thetaR <- dataR[,"total"];
>
>       l2r <- log2(theta/thetaR);
>       ds <- add_to_ds( chrnames, rep( chr, length( chrnames ) ),
>                        pos, l2r, dta[,"FreqB"], calls );
>     }
>     colnames( ds ) <- c( "Name", "Chr", "Position", "Log.R.Ratio",
>                          "B.Allele.Freq", "GType" );
>     write_table( ds, paste0( getName( ce ), ".txt" ) )
>   }
> }
>
> process_dataset( "GenomeWideSNP_6", "gal" )
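The log R ratio and B-allele fraction that the posted snippet is after can be illustrated on toy numbers, without any CEL files. This is a minimal sketch in base R: all values are made up, and the definition freqB = thetaB / (thetaA + thetaB) is stated here as an assumption about what a "FreqB" column represents, not as a description of the internals of extractTotalAndFreqB().

```r
# Hypothetical allele-specific signals for 4 SNPs in one sample
thetaA <- c(2.0, 1.0, 0.1, 1.5)
thetaB <- c(0.1, 1.0, 2.0, 1.5)

# Hypothetical reference total signal per SNP (e.g. an average across arrays)
thetaR <- c(2.0, 2.0, 2.0, 2.0)

theta <- thetaA + thetaB        # total signal
l2r   <- log2(theta / thetaR)   # log R ratio, same formula as in the snippet
freqB <- thetaB / theta         # assumed definition of the B-allele fraction

ds <- data.frame(
  Name          = paste0("SNP_", 1:4),
  Log.R.Ratio   = round(l2r, 3),
  B.Allele.Freq = round(freqB, 3)
)
print(ds)
```

Note that neither quantity needs a genotype call: both come directly from the allele-specific signals and a reference, which is why a CRMAv2-style preprocessing can provide LRR and BAF for GenomeWideSNP_6 even though CrlmmModel() is not available for that chip type.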