Re: [aroma.affymetrix] genotyping crlmm genomewidesnp 6.0
Good afternoon,

First of all, my apologies for the delay in responding.

On Saturday, March 23, 2013 22:28:52 UTC+1, Henrik Bengtsson wrote:
> Hi.
>
> On Sat, Mar 23, 2013 at 4:00 AM, Carles Hernández kurag...@gmail.com wrote:
> > Good morning,
> >
> > First of all, thanks for answering so fast. It's really helpful to be able to talk with the main creator of the library.
> >
> > Going back to the topic, sorry I didn't express myself properly. I have no idea what the CEL files contain, so the idea is to analyze the microarrays using the FreqB, LRR and genotypes. Some of them may be tumoral, but I can't know. I will use the genotype to classify the probes into AA, AB and BB in order to study the FreqB compared with LRR, and use an external program called MAD.
>
> But do you agree with me that it does not make sense to classify a SNP into (AA, AB, BB), i.e. call the genotype, if the SNP is for instance A, ABB, AAABB, or even worse a mixture of, say, 10% A, 38.5% ABB and 40.1% AAABB and the rest being the normal AB? So, I still argue that genotypes will only make sense for SNPs that you know are normal. If you don't know which samples are normal and which are tumors you will never know which SNPs/genotype calls you can trust, which to me makes the (artificial) genotype calls useless. Although I still haven't seen one, I'm all ears for a good argument for where it makes sense to call genotypes in a tumor. I'm just trying to save you from wasting your time going down the wrong path.

Yes, I agree with that, but what I actually want is a BAF estimate. For that I wanted to use CRLMM, which also predicts the genotype, but since it is not ready for GenomeWideSNP 6, using the CRMA v2 implementation, which estimates BAF pretty well, may be a solution.

> Could you provide a reference to MAD - never heard of it.
Here you can find some information related to MAD:
- http://www.biomedcentral.com/1471-2105/12/166
- http://www.creal.cat/jrgonzalez/software.htm#ancla-MAD

> > So, you said CRLMM is not implemented for GenomeWideSNP 6.0; may I contribute by implementing it?
>
> Certainly, that would be great and most appreciated. Just a heads up, it's more than a standard programming task. It requires diving into the oligo::crlmm() code and its algorithm to find out which modules can be reused and which need to be ported. The two files CrlmmModel.R and CrlmmModel.EXT.R in aroma.affymetrix/R/ would serve as a good start/template:
>
> https://r-forge.r-project.org/scm/viewvc.php/pkg/aroma.affymetrix/R/CrlmmModel.R?view=markup&root=aroma-dots
> https://r-forge.r-project.org/scm/viewvc.php/pkg/aroma.affymetrix/R/CrlmmModel.EXT.R?view=markup&root=aroma-dots
>
> If you look inside oligo::crlmm() you see that it itself takes two separate paths depending on whether the chip type is (a) Mapping50K_(Hind|Xba)240 and Mapping250K_(Nsp|Sty) [which is ported to aroma.affymetrix], or (b) GenomeWideSNP_(5|6) [which is not ported]. In other words, it's the internal oligo:::genotypeOne() that needs to be ported.

Actually, I am battling with crlmm, oligo and oligoClasses to manage my GenomeWideSNP CEL files. My priority is to finish this analysis, but maybe I will try my hand at this porting; I'm not sure, but I have it in mind.

> > Anyway, thank you for sharing the aroma.affymetrix suite with us.
>
> You're welcome - hopefully it makes everyday science a bit easier.
>
> /Henrik

Many thanks for your answers.

Carles

PS: Any thoughts on applying CRLMM to Affymetrix Axiom and Affymetrix Axiom Exome arrays?

> > On Friday, March 22, 2013 19:31:10 UTC+1, Henrik Bengtsson wrote:
> > > Hi Carles, the quick answer is that aroma.affymetrix only implements the CRLMM method for the 100K (Mapping50K_Xba240 and Mapping50K_Hind240) and 500K (Mapping250K_Nsp and Mapping250K_Sty) chip types. For newer methods you need to turn to the Bioconductor 'oligo' package.
Re: [aroma.affymetrix] genotyping crlmm genomewidesnp 6.0
Good morning,

First of all, thanks for answering so fast. It's really helpful to be able to talk with the main creator of the library.

Going back to the topic, sorry I didn't express myself properly. I have no idea what the CEL files contain, so the idea is to analyze the microarrays using the FreqB, LRR and genotypes. Some of them may be tumoral, but I can't know. I will use the genotype to classify the probes into AA, AB and BB in order to study the FreqB compared with LRR, and use an external program called MAD.

So, you said CRLMM is not implemented for GenomeWideSNP 6.0; may I contribute by implementing it?

Anyway, thank you for sharing the aroma.affymetrix suite with us.

On Friday, March 22, 2013 19:31:10 UTC+1, Henrik Bengtsson wrote:
> Hi Carles, the quick answer is that aroma.affymetrix only implements the CRLMM method for the 100K (Mapping50K_Xba240 and Mapping50K_Hind240) and 500K (Mapping250K_Nsp and Mapping250K_Sty) chip types. For newer methods you need to turn to the Bioconductor 'oligo' package.
>
> However, what are you going to use the genotypes for? I'm asking because it is rather common, and in my view incorrect, to try to call genotypes in tumor samples. Genotypes are really only defined in normal/germline genomes and most (all?) genotype methods assume that the samples are such. Calling genotypes in tumors is rather a problem of inferring parent-specific CNs (PSCNs) - not at the SNP-by-SNP level but in segments along the genome. Contrary to normal PSCNs (genotypes), tumor PSCNs may not take discrete levels due to clonality and normal contamination. In other words, if you do indeed have tumors, it does not make sense to use CRLMM on them. Instead you want to do PSCN segmentation/calling.
Re: [aroma.affymetrix] genotyping crlmm genomewidesnp 6.0
Hi.

On Sat, Mar 23, 2013 at 4:00 AM, Carles Hernández kuragari...@gmail.com wrote:
> Good morning,
>
> First of all, thanks for answering so fast. It's really helpful to be able to talk with the main creator of the library.
>
> Going back to the topic, sorry I didn't express myself properly. I have no idea what the CEL files contain, so the idea is to analyze the microarrays using the FreqB, LRR and genotypes. Some of them may be tumoral, but I can't know. I will use the genotype to classify the probes into AA, AB and BB in order to study the FreqB compared with LRR, and use an external program called MAD.

But do you agree with me that it does not make sense to classify a SNP into (AA, AB, BB), i.e. call the genotype, if the SNP is for instance A, ABB, AAABB, or even worse a mixture of, say, 10% A, 38.5% ABB and 40.1% AAABB and the rest being the normal AB? So, I still argue that genotypes will only make sense for SNPs that you know are normal. If you don't know which samples are normal and which are tumors you will never know which SNPs/genotype calls you can trust, which to me makes the (artificial) genotype calls useless. Although I still haven't seen one, I'm all ears for a good argument for where it makes sense to call genotypes in a tumor. I'm just trying to save you from wasting your time going down the wrong path.

Could you provide a reference to MAD - never heard of it.

> So, you said CRLMM is not implemented for GenomeWideSNP 6.0; may I contribute by implementing it?

Certainly, that would be great and most appreciated. Just a heads up, it's more than a standard programming task. It requires diving into the oligo::crlmm() code and its algorithm to find out which modules can be reused and which need to be ported.
The two files CrlmmModel.R and CrlmmModel.EXT.R in aroma.affymetrix/R/ would serve as a good start/template:

https://r-forge.r-project.org/scm/viewvc.php/pkg/aroma.affymetrix/R/CrlmmModel.R?view=markup&root=aroma-dots
https://r-forge.r-project.org/scm/viewvc.php/pkg/aroma.affymetrix/R/CrlmmModel.EXT.R?view=markup&root=aroma-dots

If you look inside oligo::crlmm() you see that it itself takes two separate paths depending on whether the chip type is (a) Mapping50K_(Hind|Xba)240 and Mapping250K_(Nsp|Sty) [which is ported to aroma.affymetrix], or (b) GenomeWideSNP_(5|6) [which is not ported]. In other words, it's the internal oligo:::genotypeOne() that needs to be ported.

> Anyway, thank you for sharing the aroma.affymetrix suite with us.

You're welcome - hopefully it makes everyday science a bit easier.

/Henrik

> On Friday, March 22, 2013 19:31:10 UTC+1, Henrik Bengtsson wrote:
> > Hi Carles, the quick answer is that aroma.affymetrix only implements the CRLMM method for the 100K (Mapping50K_Xba240 and Mapping50K_Hind240) and 500K (Mapping250K_Nsp and Mapping250K_Sty) chip types. For newer methods you need to turn to the Bioconductor 'oligo' package.
> >
> > However, what are you going to use the genotypes for? I'm asking because it is rather common, and in my view incorrect, to try to call genotypes in tumor samples. Genotypes are really only defined in normal/germline genomes and most (all?) genotype methods assume that the samples are such. Calling genotypes in tumors is rather a problem of inferring parent-specific CNs (PSCNs) - not at the SNP-by-SNP level but in segments along the genome. Contrary to normal PSCNs (genotypes), tumor PSCNs may not take discrete levels due to clonality and normal contamination. In other words, if you do indeed have tumors, it does not make sense to use CRLMM on them. Instead you want to do PSCN segmentation/calling.
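To make the point about mixtures and normal contamination a bit more concrete, here is a minimal base-R sketch. It is only an illustration under simplifying assumptions: the helper name expected_baf and the chosen copy-number states are hypothetical and not part of aroma.affymetrix, the SNP is assumed to be heterozygous (AB) in the germline, and a single tumor clone is mixed with normal cells. It computes the B-allele fraction one would expect to observe for different tumor fractions.

```r
# Expected B-allele fraction for a SNP that is AB in the germline, when a
# fraction `purity` of the cells carry nA copies of A and nB copies of B
# and the remaining cells are normal (one A and one B copy).
# `expected_baf` is a hypothetical helper written for this illustration only.
expected_baf <- function(nA, nB, purity) {
  b_copies     <- purity * nB + (1 - purity) * 1
  total_copies <- purity * (nA + nB) + (1 - purity) * 2
  b_copies / total_copies
}

purity  <- c(0, 0.25, 0.5, 0.75, 1)
baf_abb <- expected_baf(nA = 1, nB = 2, purity = purity)  # tumor state ABB
baf_a   <- expected_baf(nA = 1, nB = 0, purity = purity)  # tumor state A (B lost)

print(round(cbind(purity, baf_abb, baf_a), 3))
```

For intermediate tumor fractions the expected values fall between the levels that an AA/AB/BB caller assumes, which is one way to see why a discrete genotype call is hard to interpret for such a SNP.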
Re: [aroma.affymetrix] genotyping crlmm genomewidesnp 6.0
Hi Carles,

the quick answer is that aroma.affymetrix only implements the CRLMM method for the 100K (Mapping50K_Xba240 and Mapping50K_Hind240) and 500K (Mapping250K_Nsp and Mapping250K_Sty) chip types. For newer methods you need to turn to the Bioconductor 'oligo' package.

However, what are you going to use the genotypes for? I'm asking because it is rather common, and in my view incorrect, to try to call genotypes in tumor samples. Genotypes are really only defined in normal/germline genomes and most (all?) genotype methods assume that the samples are such. Calling genotypes in tumors is rather a problem of inferring parent-specific CNs (PSCNs) - not at the SNP-by-SNP level but in segments along the genome. Contrary to normal PSCNs (genotypes), tumor PSCNs may not take discrete levels due to clonality and normal contamination. In other words, if you do indeed have tumors, it does not make sense to use CRLMM on them. Instead you want to do PSCN segmentation/calling.

Hope this helps

Henrik

On Fri, Mar 22, 2013 at 7:47 AM, Carles Hernández kuragari...@gmail.com wrote:
> Good afternoon,
>
> I am trying to analyse a set of CEL files from Affymetrix GenomeWideSNP 6.0 and get their LRR, FreqB and genotype (for all individuals and for all chromosomes).
>
> I started with the vignettes "CRMA (v1): Total copy number analysis using CRMA v1 (10K, 100K, 500K)" and "CRMA (v2): Estimation of total copy numbers using the CRMA v2 method (10K-CytoScanHD)", since I am new to the world of microarray analysis. But since I didn't find any way to retrieve the genotype, I moved to "CRLMM genotyping (100K and 500K)".
>
> So, from both methods I can get the LRR and FreqB with extractCNT or with extractTotalAndFreqB, but only with the second one (CRLMM) can I use extractGenotypes (because the chip type's CRLMM model is required).
> On the other hand, when I try to create the CRLMM model for GenomeWideSNP 6.0, the following error occurs:
>
> Exception: Cannot fit CRLMM model: Model fitting for this chip type is not supported/implemented: GenomeWideSNP_6
>   at #02. CrlmmModel(ces, tags = "*,oligo")
>           - CrlmmModel() is in environment 'aroma.affymetrix'
>   at #01. process_dataset("GenomeWideSNP_6", "gal", verbose = TRUE)
>           - process_dataset() is in environment 'R_GlobalEnv'
> Error: Cannot fit CRLMM model: Model fitting for this chip type is not supported/implemented: GenomeWideSNP_6
>
> So... Am I doing something wrong? If not, is there some way to get the full set of data I need (sample's name, sample's position, chromosome, LRR, FreqB and genotype) using a single method?
>
> My full code snippet:
>
> library( 'aroma.affymetrix' )
>
> write_table <- function( dataset, file_name ) {
>   [...]
> }
>
> process_dataset <- function( dataset_name, chip_type ) {
>   cdf <- AffymetrixCdfFile$byChipType( chip_type );
>   csR <- AffymetrixCelSet$byName( dataset_name, cdf=cdf );
>   ces <- justSNPRMA( csR, normalizeToHapmap=TRUE, returnESet=FALSE );
>   crlmm <- CrlmmModel( ces, tags="*,oligo" );
>   units <- fit( crlmm, ram="oligo" );
>   callSet <- getCallSet( crlmm );
>   gi <- getGenomeInformation( cdf );
>   for( array in 1:length( csR ) ) {
>     ds <- NULL;
>     ce <- getFile( ces, array );
>     for( chr in chr_list ) {
>       chrunits <- getUnitsOnChromosome( gi, chromosome=chr );
>       chrnames <- getUnitNames( cdf, units=chrunits )
>       pos <- getPositions( gi, units=chrunits ); # / 1e6;
>       cf <- getFile( callSet, array );
>       calls <- extractGenotypes( cf, units=chrunits );
>       dta <- extractTotalAndFreqB( ce, units=chrunits );
>       theta <- dta[,"total"];
>       ceR <- getAverageFile( ces );
>       dataR <- extractTotalAndFreqB( ceR, units=chrunits );
>       thetaR <- dataR[,"total"];
>       l2r <- log2(theta/thetaR);
>       ds <- add_to_ds( chrnames, rep( chr, length( chrnames ) ), pos, l2r, dta[,"FreqB"], calls );
>     }
>     colnames( ds ) <- c( "Name", "Chr", "Position", "Log.R.Ratio", "B.Allele.Freq", "GType" );
>     write_table( ds, paste0( getName( ce ), ".txt" ) )
>   }
> }
>
> process_dataset( "GenomeWideSNP_6", "gal" )
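For readers following the arithmetic in the quoted snippet, the two quantities it collects can be sketched in plain base R without any array data. This is only a toy illustration with made-up numbers; it assumes the usual definitions (total signal as the sum of the two allele signals, allele B fraction as the B signal over the total, and LRR as the log2 ratio against a reference such as the pool average), and the variable names thetaA and thetaB are introduced here for illustration.

```r
# Toy allele-specific signals for three SNPs in one sample
# (made-up numbers, not real array data).
thetaA <- c(1200, 2300, 150)    # A-allele signal
thetaB <- c(1100, 120, 2400)    # B-allele signal
thetaR <- c(2250, 2500, 2600)   # total signal of a reference, e.g. a pool average

theta <- thetaA + thetaB        # total signal per SNP
freqB <- thetaB / theta         # allele B fraction (BAF / FreqB)
lrr   <- log2(theta / thetaR)   # log R ratio (LRR) relative to the reference

print(data.frame(theta, freqB = round(freqB, 3), lrr = round(lrr, 3)))
```

Neither quantity needs a genotype call, which fits the suggestion earlier in the thread that FreqB and LRR can be obtained from the CRMA v2 path alone.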