Re: [galaxy-user] Galaxy Question

2012-12-17 Thread Jennifer Jackson

Hello Dominique,

Would the tool 'NGS: QC and manipulation - Barcode Splitter' meet your 
needs? Please see the tool's help for usage.


Another option is to first convert the file to tabular with 'FASTA 
manipulation - FASTA-to-Tabular', then use the 'Filter and Sort - 
Filter' tool. The match criteria would look something like: c2=='ATGC' . 
Once done, convert back to fasta with 'FASTA manipulation - 
Tabular-to-FASTA'.
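
For readers who want to see what the two conversion steps amount to, here is a minimal Python sketch. It is not the Galaxy tools' code; the function names are made up for illustration, the read names are hypothetical, and it assumes a standard fasta file in which every title line starts with '>'. Column 2 of the tabular form (c2 in the Filter tool) is the sequence.

```python
def fasta_to_tabular(fasta_text):
    """Turn FASTA text into (title, sequence) rows, one row per record."""
    rows = []
    title, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            if title is not None:
                rows.append((title, "".join(chunks)))
            title, chunks = line[1:], []
        else:
            chunks.append(line)  # sequence may span several lines
    if title is not None:
        rows.append((title, "".join(chunks)))
    return rows


def tabular_to_fasta(rows):
    """Write (title, sequence) rows back out as FASTA text."""
    return "".join(f">{title}\n{seq}\n" for title, seq in rows)


# Hypothetical two-record input; read1's sequence is wrapped over two lines.
example = ">read1 1:N:0:CGGTTGT\nATGCAA\nTT\n>read2 1:N:0:CGGTTGT\nGGGATGC\n"
rows = fasta_to_tabular(example)
round_trip = tabular_to_fasta(rows)
```

Each row ends up as a title plus one unbroken sequence string, which is what makes a per-column filter possible.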


Hopefully one of these methods will work out for you,

Jen
Galaxy team

On 12/14/12 11:10 AM, D. A. Cowart wrote:

Hello,

I would like to use Galaxy to divide a very large Illumina fasta file 
(~3GB) into separate fasta files. Is this possible on Galaxy? Here is 
an example of the reads:


HWI-ST156:535:C10GLACXX:8:1101:1195:1080 1:N:0:CGGTTGT
AAATAGAATATCACATTTCACAAGCAGGACAGTGTGTGTGAAATCGTGAATTCAACGTTTATCAATTAGAACGCCTACGTGTAG

HWI-ST156:535:C10GLACXX:8:1101:1210:1102 1:N:0:CGGTTGT
ATTTATCATAACAACTTAAATCAGTCAGTGGATTTCTGTCGGTCCGGTTAGCTCGGTTGGTAAAGGCGTTTGTTCGATCGTCTGTAGCAATCGGGC

I have tried the Filter and Sort option to try and select sequences 
just by a beginning sequence (ATGC, for example) to separate these 
sequences into a specific file, but I have been unsuccessful in this.


Thank you,

Dominique


___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using reply all in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

   http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

   http://lists.bx.psu.edu/


--
Jennifer Jackson
http://galaxyproject.org


Re: [galaxy-user] Galaxy Question

2012-12-17 Thread Jennifer Jackson

Hi Dominique,

This means that there were no results. My fault - the Filter tool is the 
wrong choice. The Select tool, which you started with, is better for 
this case. The issues you had originally were most likely with the format 
or the regular expression. So, be sure to do the following this time:


1. Convert to tabular format (choose 1 for the identifier on this 
tool's form, to keep the output in a simple two-column format)


2. Use a regular expression in the Select tool, Matching, like this:

  \tATGC

Where the \t indicates tab (the tab between the two columns), 
anchoring the matching text to the start of the sequence string. You 
could be more specific with the regular expression if you need to, 
following the guidelines on the tool help, but it probably isn't 
necessary if the rest of your sequences are formatted like the examples 
below.
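
To illustrate what the \tATGC pattern does on two-column tabular lines, here is a small Python sketch. It uses Python's re module rather than the Select tool itself, and the read names and sequences are invented for the example. It also shows, as a likely explanation for the empty Filter result, that an equality test such as c2=='ATGC' only keeps rows whose whole sequence is exactly ATGC.

```python
import re

# Hypothetical two-column tabular lines: title <tab> sequence
tabular_lines = [
    "read1\tATGCAATT",   # sequence starts with ATGC
    "read2\tGGGATGC",    # ATGC appears, but not at the start
    "read3\tATGC",       # sequence is exactly ATGC
]

# The tab before ATGC ties the match to the start of the second column.
pattern = re.compile(r"\tATGC")
selected = [line for line in tabular_lines if pattern.search(line)]

# Equality on column 2 compares the entire sequence, not just its beginning.
exact = [line for line in tabular_lines if line.split("\t")[1] == "ATGC"]
```

Here selected keeps read1 and read3, while exact keeps only read3. With real reads, which are far longer than four bases, the equality test would match nothing.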


The sequences in your example are separated by an empty line - but I 
am assuming that was just the way the data was pasted into the email. In 
a properly formatted fasta file, there should be no empty lines. To 
remove empty lines in a fasta file, you can also use the Select tool 
(directly on the fasta format file), with NOT Matching and this 
regular expression:


  ^$

Where ^ means the start of a line, and $ means the end. Together, they 
indicate a blank line. NOT Matching selects all lines that are not 
blank, i.e. lines that have content.
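
The same NOT Matching idea can be sketched in Python with the re module. This is only an illustration, not the Select tool's implementation, and the fasta lines are hypothetical.

```python
import re

# Hypothetical fasta lines with stray empty lines between records
fasta_lines = [">read1", "ATGCAATT", "", ">read2", "GGGATGC", ""]

# ^$ matches a line with nothing between its start and its end.
blank = re.compile(r"^$")

# Keep every line that does NOT match, i.e. every line with content.
kept = [line for line in fasta_lines if not blank.search(line)]
```

One caveat: a line holding only spaces or tabs is not matched by ^$. If the file might contain such lines, a pattern like ^\s*$ would catch them too.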


Please give this a try and let us know how it works.

Jen
Galaxy team

On 12/17/12 8:53 AM, D. A. Cowart wrote:

Hi Jennifer,
Thank you for your reply.
I tried using the options you gave me to split the files.
The second option, which filters by base-pair identity, executes; 
however, the resulting job is green but empty and says "no peek" in 
the columns when I go to view it. Do you know what this means?

Thank you,
Dominique







--
Jennifer Jackson
http://galaxyproject.org
