Estefania and Dario

A more efficient way to do this:
> row.positive.counts <- apply(dup.data$counts, 1, function(a.row) sum(a.row > 0))

would be this:
row.positive.counts <- rowSums( dup.data$counts > 0 )

Where you can, prefer the functions rowSums(), rowMeans(), colSums()
and colMeans() to apply(). They run in compiled code rather than
calling an R function once per row or column, so they are much faster.
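
As a rough sketch of the equivalence, here is the same calculation done both ways on a small made-up matrix. The name toy.counts and its values are hypothetical and only stand in for dup.data$counts.

```r
# Hypothetical toy count matrix: 4 tags (rows) by 3 samples (columns).
toy.counts <- matrix(c(5, 0, 2,
                       1, 3, 4,
                       0, 0, 7,
                       2, 6, 1),
                     nrow = 4, byrow = TRUE,
                     dimnames = list(paste0("tag", 1:4), c("s1", "s2", "s3")))

# apply() version: calls an R function once per row.
by.apply <- apply(toy.counts, 1, function(a.row) sum(a.row > 0))

# rowSums() version: a single call on the logical matrix.
by.rowsums <- rowSums(toy.counts > 0)

# Both hold the number of positive entries in each row.
# (Compare with ==, not identical(): apply() gives integers here,
# rowSums() gives doubles.)
stopifnot(all(by.apply == by.rowsums))
```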

Best wishes
Davis



On 26 August 2011 10:00, Dario Strbenac <d.strbe...@garvan.org.au> wrote:
> Hi Estefania,
>
> If you want every column to be non-zero, you should do
>
> row.positive.counts <- apply(dup.data$counts, 1, function(a.row) sum(a.row > 0))
> filtered <- dup.data[row.positive.counts == ncol(dup.data$counts), ]
>
> row.positive.counts is built by making a logical vector for each row and
> summing it. Because TRUE counts as 1, the sum is the number of columns that
> are greater than zero. The filter then keeps the rows that have as many
> positive counts as there are columns in the count matrix.
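
To make the steps concrete, here is a minimal sketch on a plain matrix, so that it runs without edgeR. The names toy.counts, is.positive and toy.filtered are made up for illustration; with a real DGEList the same logical index would be applied to the object itself, as in the line above.

```r
# Hypothetical toy count matrix standing in for dup.data$counts.
toy.counts <- matrix(c(5, 0, 2,
                       1, 3, 4,
                       0, 0, 7,
                       2, 6, 1),
                     nrow = 4, byrow = TRUE,
                     dimnames = list(paste0("tag", 1:4), c("s1", "s2", "s3")))

# Logical matrix: TRUE where a count is greater than zero.
is.positive <- toy.counts > 0

# TRUE counts as 1, so each row sum is the number of positive columns.
row.positive.counts <- rowSums(is.positive)

# Keep only the rows in which every column is positive.
# drop = FALSE keeps the result a matrix even if a single row survives.
toy.filtered <- toy.counts[row.positive.counts == ncol(toy.counts), , drop = FALSE]
```

In this toy example only tag2 and tag4 have no zero in any column, so those are the two rows that remain.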
>
> To find unchanged genes, you might do
>
> unchanged <- dup.de.com$table[dup.de.com$table[, "logFC"] > -0.2 & 
> dup.de.com$table[, "logFC"] < 0.2, ]
>
> replacing 0.2 with the largest absolute logFC that you think an unchanged
> gene might have.
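
The same two-sided test can also be written with abs(). A small sketch follows, assuming a made-up data frame toy.table whose logFC values are the five rows shown in the original post, rounded to three decimals. The threshold name max.abs.logFC is hypothetical.

```r
# Hypothetical stand-in for dup.de.com$table, holding only the logFC column.
toy.table <- data.frame(
  logFC = c(0.262, 0.030, -0.121, 0.030, 0.144),
  row.names = c("Glyma13g11940.8", "Glyma13g11900.1", "Glyma09g24780.1",
                "Glyma13g12050.1", "Glyma13g12070.1")
)

# Assumed cutoff: largest absolute logFC still treated as "unchanged".
max.abs.logFC <- 0.2

# abs(x) < t is the same test as x > -t & x < t.
# drop = FALSE keeps a data frame even though there is only one column.
unchanged <- toy.table[abs(toy.table[, "logFC"]) < max.abs.logFC, , drop = FALSE]
```

With a cutoff of 0.2, the first tag (logFC about 0.26) is dropped and the other four are kept.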
>
> ---- Original message ----
>>Date: Thu, 25 Aug 2011 11:39:03 -0300 (ART)
>>From: bioc-sig-sequencing-boun...@r-project.org (on behalf of Estefania 
>>Mancini <estefania.manc...@indear.com>)
>>Subject: [Bioc-sig-seq] extract non-zero rows
>>To: bioc-sig-sequencing@r-project.org
>>
>>Dear all
>>I have loaded and analyzed four 454 datasets, corresponding to control
>>and stress samples with their biological replicates.
>>I would like to know whether it is possible to filter, in my DGEList object,
>>
>>- which tags do not have a zero in any column,
>>- which of these tags could be considered "housekeeping" (at least with logFC
>>near 0)
>>
>>The object  DGEList  looks like this:
>>
>>>dup.data
>>An object of class "DGEList"
>>$samples
>>             group lib.size norm.factors
>>A8_control control    77953            1
>>A8_stress   stress   176860            1
>>mq_control control    98109            1
>>mq_stress   stress   145839            1
>>pi_control control   132479            1
>>pi_stress   stress   142484            1
>>tj_control control    65827            1
>>tj_stress   stress   144278            1
>>
>>I have tried to filter using the suggested function:
>>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 0, ]
>>or with
>>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 1, ]
>>but nothing changes at all. I have many rows with 0 or 1 reads in some
>>column, which should be excluded.
>>
>>Also:
>>dup.de.com
>>An object of class "DGEExact"
>>$table
>>                  logConc       logFC   p.value
>>Glyma13g11940.8 -2.588833  0.26176050 0.7348221
>>Glyma13g11900.1 -2.875548  0.03020441 0.9688072
>>Glyma09g24780.1 -3.501041 -0.12108619 0.8754371
>>Glyma13g12050.1 -3.224648  0.03036675 0.9691009
>>Glyma13g12070.1 -3.743064  0.14416487 0.8521188
>>19860 more rows ...
>>
>>$comparison
>>[1] "control" "stress"
>>$genes
>>NULL
>>
>>Thanks in advance,
>>Estefania
>>
>>_______________________________________________
>>Bioc-sig-sequencing mailing list
>>Bioc-sig-sequencing@r-project.org
>>https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>
> --------------------------------------
> Dario Strbenac
> Research Assistant
> Cancer Epigenetics
> Garvan Institute of Medical Research
> Darlinghurst NSW 2010
> Australia
>
