Dear Jane,

in the case of genes (or better term: loci) with only one exon, DEXSeq's testGeneForDEU does not make sense - it is looking for exons within the locus that behave differently from the average of all the others: there are no others! Hence the parameters are unidentifiable, and the software should not even look at them. We'll robustify the software for this case in future versions (note that the package is still actively being developed).

The case of the NA values that are not explained by there only being one exon is more interesting. Can you send us the code to reproduce your problem (this also requires the data, which you can send offline, and in which you can for instance anonymize the gene and sample annotations).

        Best wishes
        Wolfgang

Jul/28/11 4:23 PM, Jane Merlevede scripsit::
Hello,

I am using DEXseq for exon differential analysis. The analysis seems to work
well on genes with several exons, but I have a problem with genes with only
one exon.
Let me present you my dataset:

pData(ecs)
         sizeFactor condition replicate       type
Raman_1  0.9638822    nonvir         1 paired-end
HM1_1    1.4000715       vir         1 paired-end
Raman_2  1.0314049    nonvir         2 paired-end
Raman_3  1.0734133    nonvir         3 paired-end
HM1_2    0.7801269       vir         2 paired-end
HM1_3    0.9488409       vir         3 paired-end


table( table (geneIDs(ecs)))
    1    2    3    4    5    6    7    8
3149  893  874  206  100   42   18    4

As you can see, I have 3149 genes (out of 5286) which have only 1 exon.
There are 9291 exons.

res1<- DEUresultTable(ecs)

summary(res1)
                      geneID         exonID     dispersion_CR_est
  EHI_051420             :   8   E001   :5286   Min.   :0.000e+00
  EHI_070690             :   8   E002   :2137   1st Qu.:7.176e-03
  EHI_169670             :   8   E003   :1244   Median :1.693e-02
  EHI_175010+EHI_175000.1:   8   E004   : 370   Mean   :5.546e-02
  EHI_005010             :   7   E005   : 164   3rd Qu.:3.641e-02
  EHI_042710             :   7   E006   :  64   Max.   :1.300e+01
  (Other)                :9245   (Other):  26   NA's   :3.153e+03
    dispersion            pvalue             padjust
  Min.   :2.010e-02   Min.   :   0.0000   Min.   :   0.0000
  1st Qu.:2.202e-02   1st Qu.:   0.1150   1st Qu.:   0.4597
  Median :2.628e-02   Median :   0.4054   Median :   0.8086
  Mean   :3.229e+04   Mean   :   0.4269   Mean   :   0.6775
  3rd Qu.:4.214e-02   3rd Qu.:   0.7133   3rd Qu.:   0.9510
  Max.   :1.000e+08   Max.   :   1.0000   Max.   :   1.0000
                      NA's   :3153.0000   NA's   :3153.0000

There are 3153 NA's values. When checking out what were those NA's values, I
realized that they were all the genes with only one exon (and only 4 more
genes in the other cases). I doubt that it is by chance... And they don't
have weird read counts. Check for example EHI_000010, EHI_000410,
EHI_000420, EHI_000430, EHI_000440:

geneCountTable(ecs)[1:21,]
              Raman_1 HM1_1 Raman_2 Raman_3 HM1_2 HM1_3
EHI_000010       391  1364     464     449   457   560
EHI_000130     18806  8116   21890   24994  6509 14067
EHI_000240     24057 21825   15365   26903 21101 27841
EHI_000250      3374  2689    3938    4854  2310  3556
EHI_000290       458  3721     437     550  1716  1054
EHI_000410       896  1041    1086     978   905   745
EHI_000420       140   220     244     117   140    78
EHI_000430       583  2043     304     634  1398  1197
EHI_000440      3103 15309    5358    3619  9152 10392
EHI_000450       358   559     991     516   281   564
EHI_000460.1     357   118     449     425    43   746
EHI_000470.1     408  2657     297     397   946   909
EHI_000480       842  1702     858     778   841   847
EHI_000490       187   648     221     239   370   249
EHI_000500       329   506     454     402   244   555
EHI_000520       482  1615     883     440   925   751
EHI_000530      1794  1588    1751    1966  1099  1860
EHI_000540      2857  2504    3706    4152  1479  2547
EHI_000550      1967  3999    1241    1904  2505  2448
EHI_000560       152   434     239     165   273   175
EHI_000570      1209  2142    2183    1213  1905  2662


res1[1:20,]
                        geneID exonID dispersion_CR_est dispersion     pvalue
EHI_000010:001     EHI_000010   E001                NA 0.02377712         NA
EHI_000130:001     EHI_000130   E001       0.074669451 0.07466945 0.48673539
EHI_000130:002     EHI_000130   E002       0.128972813 0.12897281 0.94279527
EHI_000130:003     EHI_000130   E003       0.019602219 0.02031871 0.84327914
EHI_000130:004     EHI_000130   E004       0.067337196 0.06733720 0.70429765
EHI_000240:001     EHI_000240   E001       0.043753997 0.04375400 0.74939865
EHI_000240:002     EHI_000240   E002       0.058562700 0.05856270 0.74939865
EHI_000250:001     EHI_000250   E001       0.031440630 0.03144063 0.70444841
EHI_000250:002     EHI_000250   E002       0.037457450 0.03745745 0.70444841
EHI_000290:001     EHI_000290   E001       0.021113888 0.02396819 0.05100089
EHI_000290:002     EHI_000290   E002       0.007626346 0.03471905 0.95676220
EHI_000290:003     EHI_000290   E003       0.036889359 0.03688936 0.03057067
EHI_000410:001     EHI_000410   E001                NA 0.02235347         NA
EHI_000420:001     EHI_000420   E001                NA 0.03395573         NA
EHI_000430:001     EHI_000430   E001                NA 0.02219522         NA
EHI_000440:001     EHI_000440   E001                NA 0.02037263         NA
EHI_000450:001     EHI_000450   E001       0.024495906 0.04492663 0.64717121
EHI_000450:002     EHI_000450   E002       0.020356260 0.02483641 0.64717121
EHI_000460.1:001 EHI_000460.1   E001                NA 0.02602179         NA
EHI_000470.1:001 EHI_000470.1   E001                NA 0.02254333         NA
                    padjust
EHI_000010:001          NA
EHI_000130:001   0.8639624
EHI_000130:002   0.9927663
EHI_000130:003   0.9756197
EHI_000130:004   0.9503086
EHI_000240:001   0.9611397
EHI_000240:002   0.9611397
EHI_000250:001   0.9503086
EHI_000250:002   0.9503086
EHI_000290:001   0.2890521
EHI_000290:002   0.9962043
EHI_000290:003   0.2087239
EHI_000410:001          NA
EHI_000420:001          NA
EHI_000430:001          NA
EHI_000440:001          NA
EHI_000450:001   0.9278993
EHI_000450:002   0.9278993
EHI_000460.1:001        NA
EHI_000470.1:001        NA

I checked the model in the documentation and I don't see why the test does
not give a result in this case. Has anyone get the same problem?

Thanks for your help,
Jane Merlevède

        [[alternative HTML version deleted]]




_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


--


Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber

_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Reply via email to