Re: [galaxy-user] Filter fastq by percentage of ambiguous (N) bases

2013-08-06 Thread Jennifer Jackson

Hello Anto,

I don't know of a specific tool that does this based on read content,
but you could use the very low quality score (2) assigned to ambiguous
bases together with the 'Filter by quality' tool to filter by
percentage. Be aware that other bases may also be assigned this low
score, but these would very likely not be of practical use anyway.
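
For anyone working outside Galaxy, the idea of filtering by the percentage of quality-2 bases can be sketched in Python. This is only an illustration, not how the 'Filter by quality' tool is implemented. It assumes four-line FASTQ records with Phred+33 quality encoding, and the function names, the 10% default, and the strict "below the threshold" cutoff are all hypothetical choices.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str, str]  # (header, sequence, plus line, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield 4-line FASTQ records; assumes sequences are not line-wrapped."""
    non_empty = (line.rstrip("\n") for line in lines if line.strip())
    # Passing the same iterator four times groups consecutive lines in fours.
    return zip(non_empty, non_empty, non_empty, non_empty)


def low_quality_fraction(qual: str, max_score: int = 2, offset: int = 33) -> float:
    """Fraction of bases with Phred score <= max_score (Phred+33 assumed)."""
    if not qual:
        return 0.0
    low = sum(1 for ch in qual if ord(ch) - offset <= max_score)
    return low / len(qual)


def filter_reads(records: Iterable[Record],
                 max_fraction: float = 0.10) -> Iterator[Record]:
    """Keep reads whose low-quality fraction is strictly below max_fraction."""
    for record in records:
        if low_quality_fraction(record[3]) < max_fraction:
            yield record


# Two made-up 20-base reads: one low-quality base versus four.
example = [
    "@read1", "ACGT" * 4 + "ACGN", "+", "I" * 19 + "#",
    "@read2", "ACGT" * 4 + "NNNN", "+", "I" * 16 + "#" * 4,
]
kept = [rec[0] for rec in filter_reads(parse_fastq(example))]
print(kept)
```

With a strict cutoff, a 50 bp read is dropped once it has five or more quality-2 bases, which matches the request in the original question. Change `<` to `<=` if reads at exactly 10% should be kept.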


You could clip these ends first, then do the filter, discarding any
reads that have very short usable sequence left. If the data is
Illumina, this is likely a sign of a sequence that failed vendor
quality checks, and these are no longer removed by default as of
Casava 1.8+.
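
The clip-then-filter order can also be sketched in Python. This is a rough illustration under stated assumptions, not the behaviour of any particular Galaxy tool: it assumes Phred+33 encoding and treats scores of 2 or lower as clippable. The function names and the minimum length of 20 are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def clip_low_quality_ends(seq: str, qual: str,
                          max_score: int = 2,
                          offset: int = 33) -> Tuple[str, str]:
    """Trim bases with Phred score <= max_score from both ends of a read."""
    start, end = 0, len(qual)
    while start < end and ord(qual[start]) - offset <= max_score:
        start += 1
    while end > start and ord(qual[end - 1]) - offset <= max_score:
        end -= 1
    return seq[start:end], qual[start:end]


def clip_and_filter(reads: Iterable[Tuple[str, str]],
                    min_length: int = 20) -> Iterator[Tuple[str, str]]:
    """Clip each (sequence, quality) pair, then drop reads that end up too short."""
    for seq, qual in reads:
        clipped_seq, clipped_qual = clip_low_quality_ends(seq, qual)
        if len(clipped_seq) >= min_length:
            yield clipped_seq, clipped_qual
```

Low-quality bases inside the read are left alone by this sketch. Only runs at either end are removed, so a percentage filter would still be needed afterwards for internal ambiguous bases.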


Creating a regular expression with the Select tool is another option,
but this is probably more effort to construct than it is worth. It is
your choice, though. A Google search will bring up syntax advice.
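
As a starting point, one possible pattern for "five or more N bases anywhere in the sequence" is shown here using Python's `re` module. The regular expression dialect accepted by the Select tool may differ, and the tool works on lines rather than whole FASTQ records, so treat this as a sketch of the pattern only. The helper name is hypothetical.

```python
import re

# Matches any sequence containing at least five N characters,
# whether or not they are adjacent: an N followed by any run of
# non-N characters, repeated five times.
FIVE_OR_MORE_N = re.compile(r"(?:N[^N]*){5}")


def has_five_or_more_n(seq: str) -> bool:
    """Return True if the sequence contains five or more N bases."""
    return FIVE_OR_MORE_N.search(seq) is not None
```

Changing the `{5}` repeat count adjusts the threshold, for example `{3}` for three or more N bases.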


Ideally the first will do the job,

Jen
Galaxy team

On 7/29/13 3:17 AM, Anto Praveen Rajkumar Rajamani wrote:

Hello,

I would like to filter my fastq files (50 bp single-end Illumina
RNA-seq reads) by a maximum threshold (10%) of ambiguous (N) bases.
I can see that the CLIP tool removes all reads with one or more N 
bases.
Is there a way to remove only the reads with five or more N bases 
using Galaxy?

Thank you.

Best wishes,
Anto



___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using reply all in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

   http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

   http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:

   http://galaxyproject.org/search/mailinglists/


--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org


Re: [galaxy-user] Filter Fastq

2013-05-09 Thread Peter Cock
On Thu, May 9, 2013 at 12:27 PM, Casey,Richard
richard.ca...@colostate.edu wrote:
 Hi,

 We have two Filter FASTQ jobs running on the Galaxy public server.
 Both jobs have been running for more than four days.  This seems like
 an excessive amount of runtime.  Do Filter FASTQ jobs normally take
 this long to run?

Which FASTQ filtering tool exactly are you referring to? The one called
"Filter FASTQ reads by quality score and length" in the left-hand column,
tool ID fastq_filter?

How big were the input files?

Peter


Re: [galaxy-user] Filter Fastq

2013-05-09 Thread Jennifer Jackson

Hi Richard,

The public Main Galaxy server has been very busy lately. For reference,
the size of a job does not determine when it will start. Start time
depends only on when the job was queued relative to other users' queued
jobs and, in less common circumstances, on the specific type of job
(one that requires a particular cluster node type; this was not the
case for your job). How long a job executes (stays in the yellow
"running" state) is related to the size of the inputs, the type of job,
and the parameters.


I see that these have now failed with a cluster error - please re-run 
the failed jobs one more time. If you continue to have problems, please 
submit one of the error datasets as a bug report, leaving all 
inputs/outputs undeleted.

http://wiki.galaxyproject.org/Support#Reporting_tool_errors

Very sorry that this was causing confusion. And thanks Peter for the help!

Jen
Galaxy team

On 5/9/13 4:27 AM, Casey,Richard wrote:

Hi,

We have two Filter FASTQ jobs running on the Galaxy public server.  Both jobs 
have been running for more than four days.  This seems like an excessive amount 
of runtime.  Do Filter FASTQ jobs normally take this long to run?



Richard




--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org
