Dear Mayte

I have some ideas about what might be going awry here. Thank you for providing 
more commands and output, but from what you have provided some things are still 
ambiguous, so I will have to make a few guesses as to what is happening.

In brief, though, producing smear and mean-variance plots for paired or other 
complex experimental designs is problematic (theoretically and computationally) 
and the plotSmear() and plotMeanVar() functions in edgeR are not currently set 
up to handle paired data.

Some more comments below.


On Nov 5, 2010, at 5:30 AM, Mayte Suarez-Farinas wrote:

> Dear Gordon:
> 
> you were right, but I was not wrong.
> (I did the analysis for common and tagwise dispersion estimates.
> the tagwise ends in tgw.
> Anyway here is the code
> 
> 
> PScounts <-readDGE(files.pheno)

I cannot know what is in files.pheno, but if it is not a targets file such as 
recommended in the User's Guide, then PScounts will be a DGEList without an 
experimental group defined (i.e. PScounts$samples$group is NULL). 

It would be useful here to see the output from PScounts$samples so we have some 
idea about the design of the experiment and what is and is not defined.

> d.PS <- calcNormFactors(PScounts)
> dr.PS <- d.PS[rowSums(d.PS$counts) > 9, ]
> dr.PS.tgw <- estimateCRDisp(dr.PS, design, tagwise = TRUE, prior.n = 10)
> glmfit.PS.tgw <- glmFit(dr.PS.tgw, design, dispersion = dr.PS.tgw 
> $CR.tagwise.dispersion)
> lrt.PS.tgw <- glmLRT(dr.PS.tgw, glmfit.PS.tgw)
> 
> 
> plotSmear(d.PS, panel.first=grid(), smooth.scatter=FALSE)
> Error: length(pair) == 2 is not TRUE

Following from the output above, we have d.PS$samples$group is NULL. 
plotSmear() tries to produce an MA-plot using two groups defined by 
d.PS$samples$group and since this is NULL, it produces this (admittedly 
noninformative) error. I will add a check to make this clearer.

plotSmear() is designed to deal with an experiment with multiple groups in a 
one-way layout, so is not really appropriate for paired data.


> 
>> str(d.PS)
> Formal class 'DGEList' [package "edgeR"] with 1 slots
>   ..@ .Data:List of 2
>   .. ..$ :'data.frame':       6 obs. of  5 variables:
>   .. .. ..$ files       : Factor w/ 6 levels "LS252.counts.txt",..: 1  
> 2 3 4 5 6
>   .. .. ..$ Tissue      : Factor w/ 2 levels "NL","LS": 2 2 2 1 1 1
>   .. .. ..$ Patient     : Factor w/ 3 levels "25","28","29": 1 2 3 1  
> 2 3
>   .. .. ..$ lib.size    : num [1:6] 23067191 20684675 19881245  
> 19665929 22938039 ...
>   .. .. ..$ norm.factors: num [1:6] 0.95 0.991 1.033 0.954 1.071 ...
>   .. ..$ : num [1:34487, 1:6] 0 0 2 0 100 0 0 4 0 35 ...
>   .. .. ..- attr(*, "dimnames")=List of 2
>   .. .. .. ..$ : chr [1:34487] "NM_000015" "NM_000016" "NM_000017"  
> "NM_000018" ...
>   .. .. .. ..$ : chr [1:6] "LS-25" "LS-28" "LS-29" "NL-25" ...
> 
> 
> plotMeanVar(dr.PS.tgw, meanvar=meanvar, show.tagwise.vars=TRUE,  
> NBline=TRUE, dispersion.method="coxreid")

Unfortunately, I can't see the meanvar object defined anywhere, so it is 
difficult to say what causes this error. All I can tell from the error message 
is that  meanvar$means is different in length to 
dr.PS.tgw$CR.tagwise.dispersion. This could arise if you created the meanvar 
object from a DGEList before filtering, then did your analysis on a filtered 
object, e.g.:
meanvar <- plotMeanVar(d.unfiltered)
d.filtered <- d.unfiltered[ rowSums(d.unfiltered) > 9, ]
plotMeanVar(d.filtered, meanvar, ... )

Beyond that, it is difficult to say.

However, it is problematic to use plotMeanVar for a paired design. The function 
computes pooled sample variances across groups - if these are undefined, or 
cannot be defined appropriately for the experimental design, then it is not 
meaningful to produce such a mean-variance plot.

In looking at paired data myself (6 samples from three individuals, one cancer 
and one normal sample from each patient) I have found it interesting to define 
a group variable (i.e d$samples$group) as "cancer" or "normal" for each sample 
and produce a mean-variance plot using those defined groups. However, this 
ignores the paired nature of the data, and any such plot should be interpreted 
with caution. I have found such plots useful, but I hesitate to recommend that 
approach generally. The same caveats would apply to taking this approach to 
producing an MA-plot (smear plot) for paired data.

In summary, all of these visualizations that work very nicely for one-way 
experimental layouts are much more challenging to produce for general 
experimental designs, although we are thinking about ways in which to do it 
appropriately.

Hope that helps with your analysis.

Best regards
Davis



> Error in xy.coords(x, y, xlabel, ylabel, log) :
>   'x' and 'y' lengths differ
> In addition: Warning messages:
> 1: In meanvar$means^2 * tagwise.dispersion :
>   longer object length is not a multiple of shorter object length
> 2: In meanvar$means + meanvar$means^2 * tagwise.dispersion :
>   longer object length is not a multiple of shorter object length
>> str(dr.PS.tgw)
> Formal class 'DGEList' [package "edgeR"] with 1 slots
>   ..@ .Data:List of 5
>   .. ..$ :'data.frame':       6 obs. of  5 variables:
>   .. .. ..$ files       : Factor w/ 6 levels "LS252.counts.txt",..: 1  
> 2 3 4 5 6
>   .. .. ..$ Tissue      : Factor w/ 2 levels "NL","LS": 2 2 2 1 1 1
>   .. .. ..$ Patient     : Factor w/ 3 levels "25","28","29": 1 2 3 1  
> 2 3
>   .. .. ..$ lib.size    : num [1:6] 23067191 20684675 19881245  
> 19665929 22938039 ...
>   .. .. ..$ norm.factors: num [1:6] 0.95 0.991 1.033 0.954 1.071 ...
>   .. ..$ : num [1:12373, 1:6] 2 100 4 35 11 ...
>   .. .. ..- attr(*, "dimnames")=List of 2
>   .. .. .. ..$ : chr [1:12373] "NM_000017" "NM_000019" "NM_000022"  
> "NM_000024" ...
>   .. .. .. ..$ : chr [1:6] "LS-25" "LS-28" "LS-29" "NL-25" ...
>   .. ..$ : num [1:6, 1:4] 1 1 1 1 1 1 0 1 0 0 ...
>   .. .. ..- attr(*, "dimnames")=List of 2
>   .. .. .. ..$ : chr [1:6] "1" "2" "3" "4" ...
>   .. .. .. ..$ : chr [1:4] "(Intercept)" "Patient28" "Patient29"  
> "TissueLS"
>   .. .. ..- attr(*, "assign")= int [1:4] 0 1 1 2
>   .. .. ..- attr(*, "contrasts")=List of 2
>   .. .. .. ..$ Patient: chr "contr.treatment"
>   .. .. .. ..$ Tissue : chr "contr.treatment"
>   .. ..$ : num 0.0374
>   .. ..$ : num [1:12373] 0.0369 0.0303 0.036 0.0301 0.0365 ...
>> 
> 
> Mayte Suarez-Farinas
> Research Associate, The Rockefeller University
> Biostatistician, The Rockefeller University Hospital
> 1230 York Ave, Box 178,
> New York, NY, 10065
> +1(212) 327-8213
> 
> 
> 
> 
> 
> On Nov 3, 2010, at 7:29 PM, Gordon K Smyth wrote:
> 
>> Dear Mayte,
>> 
>>> Date: Wed, 3 Nov 2010 10:56:42 -0400
>>> From: Mayte Suarez-Farinas <[email protected]>
>>> To: [email protected]
>>> Subject: [Bioc-sig-seq] edgeR plots for paired samples
>>> Message-ID: <[email protected]>
>>> Content-Type: text/plain
>>> 
>>> Dear Bioc
>>> 
>>> I am using edgeR for paired samples and I am having difficulties with
>>> both plotSmear and plotMeanVar functions. I dont know if those  
>>> functions
>>> simply do not work on those cases
>>> 
>>> PScounts <-readDGE(files.pheno)
>>> d.PS <- calcNormFactors(PScounts)
>>> dr.PS <- d.PS[rowSums(d.PS$counts) > 9, ]
>>> dr.PS.c <- estimateCRDisp(dr.PS, design)
>>> glmfit.PS.c <- glmFit(dr.PS.c, design, dispersion = dr.PS.c
>>> $CR.common.dispersion)
>>> lrt.PS.c <- glmLRT(dr.PS.c, glmfit.PS.c)
>>> 
>>> plotMeanVar(dr.PS.tgw, meanvar=meanvar, show.tagwise.vars=TRUE,
>>> NBline=TRUE, dispersion.method="coxreid")
>>> Error in xy.coords(x, y, xlabel, ylabel, log) :
>>> 'x' and 'y' lengths differ
>>> In addition: Warning messages:
>>> 1: In meanvar$means^2 * tagwise.dispersion :
>>> longer object length is not a multiple of shorter object length
>>> 2: In meanvar$means + meanvar$means^2 * tagwise.dispersion :
>>> longer object length is not a multiple of shorter object length
>> 
>> The object dr.PS.tgw that you pass to plotMeanVar() doesn't appear  
>> in any
>> of the previous code that you give.  Same with meanvar.  Is it  
>> possible
>> that you've simply used the wrong object name?  It would be normal  
>> to pass
>> to plotMeanVar your data object such as d.PS or dr.PS.
>> 
>> Best wishes
>> Gordon
>> 
>>> sessionInfo()
>>> R version 2.12.0 (2010-10-15)
>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>> 
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>> 
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>> 
>>> other attached packages:
>>> [1] limma_3.6.0          PGSEA_1.20.0         hgu133plus2.db_2.4.5
>>> org.Hs.eg.db_2.4.6   RSQLite_0.9-2
>>> [6] DBI_0.2-5            AnnotationDbi_1.12.0 Biobase_2.10.0
>>> MASS_7.3-8           biomaRt_2.6.0
>>> [11] edgeR_1.8.0
>>> 
>>> loaded via a namespace (and not attached):
>>> [1] affy_1.28.0           affyio_1.18.0
>>> annotate_1.28.0       Biostrings_2.18.0     gcrma_2.22.0
>>> [6] genefilter_1.32.0     IRanges_1.8.0
>>> preprocessCore_1.12.0 RCurl_1.4-3           simpleaffy_2.26.0
>>> [11] splines_2.12.0        survival_2.35-8       tools_2.12.0
>>> XML_3.2-0             xtable_1.5-6
>>>> 
>>> Mayte Suarez-Farinas
>>> Research Associate, The Rockefeller University
>>> Biostatistician, The Rockefeller University Hospital
>>> 1230 York Ave, Box 178,
>>> New York, NY, 10065
>>> +1(212) 327-8213
>> 
>> 
>> ______________________________________________________________________
>> The information in this email is confidential and inte...{{dropped:10}}
> 
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

------------------------------------------------------------------------
Davis J McCarthy
Research Technician
Bioinformatics Division
Walter and Eliza Hall Institute of Medical Research
1G Royal Parade, Parkville, Vic 3052, Australia
[email protected]
http://www.wehi.edu.au



______________________________________________________________________
The information in this email is confidential and intend...{{dropped:6}}

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Reply via email to