Dear all,

Thank you for your prompt reply.

I wonder whether it is better to filter the data before doing the exact test 
and analyse only those few rows, or to analyse all the data and filter after 
the exact test.

In my case the difference is large: about 19000 rows without the filter and 
only 300 after keeping the rows with no zeros.

The aim of the filter is to see which genes are mapped in all samples. 
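
To make the two options concrete, here is a minimal sketch of the filter on toy data. It uses a plain count matrix in base R rather than the real DGEList, and the names `counts`, `keep` and `filtered` are made up for illustration only.

```r
# Toy count matrix: 4 tags x 4 samples (hypothetical values, not the real data).
counts <- matrix(c(5, 0, 3, 2,
                   1, 4, 7, 9,
                   0, 0, 0, 1,
                   8, 6, 2, 3),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("tag", 1:4),
                                 c("A_control", "A_stress",
                                   "B_control", "B_stress")))

# A tag is kept only if every sample has at least one read.
keep <- rowSums(counts > 0) == ncol(counts)

# Option 1: filter first, then test only the surviving rows.
filtered <- counts[keep, , drop = FALSE]

# Option 2: test everything, then use 'keep' to subset the result table.
nrow(counts)    # rows before filtering
nrow(filtered)  # rows after filtering
```

The same logical vector `keep` could be used in either order, so the question is only about which rows go into the test.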

Thanks in advance

Estefania

----- Original message -----
From: [email protected]
To: [email protected]
Sent: Sunday, 28 August 2011 7:00:04
Subject: Bioc-sig-sequencing Digest, Vol 42, Issue 11

Today's Topics:

   1. Re: extract non-zero rows (Dario Strbenac)


----------------------------------------------------------------------

Message: 1
Date: Sun, 28 Aug 2011 17:00:22 +1000 (EST)
From: Dario Strbenac <[email protected]>
To: "Davis McCarthy" <[email protected]>
Cc: Estefania Mancini <[email protected]>,
        [email protected]
Subject: Re: [Bioc-sig-seq] extract non-zero rows
Message-ID: <[email protected]>
Content-Type: text/plain; charset=iso-8859-1

Ah, yes, that method is better. I forgot to use it in my example.
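
For the record, a small check on made-up counts that the apply() and rowSums() versions quoted below give the same answer. The matrix `toy.counts` is hypothetical and only stands in for dup.data$counts.

```r
# Hypothetical counts standing in for dup.data$counts.
toy.counts <- matrix(c(0, 3, 2,
                       4, 1, 6,
                       0, 0, 5),
                     nrow = 3, byrow = TRUE)

# Original version: count the positive entries row by row with apply().
via.apply <- apply(toy.counts, 1, function(a.row) sum(a.row > 0))

# Suggested version: sum the logical matrix directly.
via.rowSums <- rowSums(toy.counts > 0)

# Both give the number of non-zero columns per row.
all.equal(as.numeric(via.apply), as.numeric(via.rowSums))
```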

- Dario.

---- Original message ----
>Date: Sat, 27 Aug 2011 13:53:16 +1000
>From: [email protected] (on behalf of Davis McCarthy 
><[email protected]>)
>Subject: Re: [Bioc-sig-seq] extract non-zero rows  
>To: [email protected]
>Cc: Estefania Mancini <[email protected]>, 
>[email protected]
>
>Estefania and Dario
>
>A more efficient way to do this:
>> row.positive.counts <- apply(dup.data$counts, 1, function(a.row) sum(a.row > 0))
>
>would be this:
>row.positive.counts <- rowSums( dup.data$counts > 0 )
>
>You might prefer to use the functions rowSums(), rowMeans(),
>colSums(), colMeans() instead of apply(), where you can. They are much
>faster.
>
>Best wishes
>Davis
>
>
>
>On 26 August 2011 10:00, Dario Strbenac <[email protected]> wrote:
>> Hi Estefania,
>>
>> If you want all columns to be non-zero, you should do
>>
>
>> filtered <- dup.data[row.positive.counts == ncol(dup.data$counts), ]
>>
>> It makes a boolean vector for each row, then sums it. Because TRUE is the 
>> same as 1, the sum gives you how many columns are greater than zero. 
>> Then, the rows that have as many positive counts as there are columns in 
>> the count matrix are kept.
>>
>> To find unchanged genes, you might do
>>
>> unchanged <- dup.de.com$table[dup.de.com$table[, "logFC"] > -0.2 & dup.de.com$table[, "logFC"] < 0.2, ]
>>
>> replacing 0.2 with the largest log fold change that you think unchanged 
>> genes might have.
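
As a sketch of that logFC filter on made-up numbers: the data frame `toy.table` and the variable `cutoff` are hypothetical, standing in for dup.de.com$table and the 0.2 threshold. It uses abs() as a shorter way to write the same two-sided comparison.

```r
# Hypothetical results table standing in for dup.de.com$table.
toy.table <- data.frame(logFC   = c(0.26, 0.03, -0.12, 1.50, -0.90),
                        p.value = c(0.73, 0.97, 0.88, 0.01, 0.04),
                        row.names = paste0("gene", 1:5))

# Largest absolute log fold change still treated as "unchanged" (an assumption).
cutoff <- 0.2

# abs() covers both sides of zero in one comparison.
unchanged <- toy.table[abs(toy.table[, "logFC"]) < cutoff, ]
rownames(unchanged)
```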
>>
>> ---- Original message ----
>>>Date: Thu, 25 Aug 2011 11:39:03 -0300 (ART)
>>>From: [email protected] (on behalf of Estefania 
>>>Mancini <[email protected]>)
>>>Subject: [Bioc-sig-seq] extract non-zero rows
>>>To: [email protected]
>>>
>>>Dear all
>>>I have loaded and properly analysed 4 454 datasets, corresponding to control 
>>>and stress samples with their biological replicates.
>>>I would like to know if it is possible to filter, in my DGEList object:
>>>
>>>-which tags don't have a zero in any column,
>>>-which of these tags could be considered "housekeeping" (at least with logFC 
>>>near 0)
>>>
>>>The DGEList object looks like this:
>>>
>>>>dup.data
>>>An object of class "DGEList"
>>>$samples
>>>             group lib.size norm.factors
>>>A8_control control    77953            1
>>>A8_stress   stress   176860            1
>>>mq_control control    98109            1
>>>mq_stress   stress   145839            1
>>>pi_control control   132479            1
>>>pi_stress   stress   142484            1
>>>tj_control control    65827            1
>>>tj_stress   stress   144278            1
>>>
>>>I have tried to filter using the suggested function:
>>>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 0, ]
>>>or with
>>>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 1, ]
>>>but nothing changes at all. I have many rows with 0 or 1 reads in some 
>>>column, and these should be excluded.
>>>
>>>Also:
>>>dup.de.com
>>>An object of class "DGEExact"
>>>$table
>>>                  logConc       logFC   p.value
>>>Glyma13g11940.8 -2.588833  0.26176050 0.7348221
>>>Glyma13g11900.1 -2.875548  0.03020441 0.9688072
>>>Glyma09g24780.1 -3.501041 -0.12108619 0.8754371
>>>Glyma13g12050.1 -3.224648  0.03036675 0.9691009
>>>Glyma13g12070.1 -3.743064  0.14416487 0.8521188
>>>19860 more rows ...
>>>
>>>$comparison
>>>[1] "control" "stress"
>>>$genes
>>>NULL
>>>
>>>Thanks in advance,
>>>Estefania
>>>
>>>_______________________________________________
>>>Bioc-sig-sequencing mailing list
>>>[email protected]
>>>https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>>
>> --------------------------------------
>> Dario Strbenac
>> Research Assistant
>> Cancer Epigenetics
>> Garvan Institute of Medical Research
>> Darlinghurst NSW 2010
>> Australia
>>


--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia


