I posted this question to the SEQanswers forum but have not received any
feedback. I am working with Illumina RNA-seq data files in Galaxy
(http://main.g2.bx.psu.edu/). The two files contain 100 bp paired-end reads,
multiplexed with barcodes to distinguish samples. The barcodes are the first
four bases of each sequence in the s_7_1_sequence.txt file.
Would the following Galaxy workflow be correct?
1. Upload both s_7_1_sequence.txt and s_7_2_sequence.txt to Galaxy with the
reference genome selected
2. Run NGS: QC and manipulation --> FASTQ Groomer on each file to convert to
3. Run NGS: QC and manipulation --> FASTQ joiner to combine the data from the
4. Run FASTX-TOOLKIT FOR FASTQ DATA --> Barcode Splitter to generate separate
FASTQ files for each barcode group
5. Run NGS: RNA Analysis --> TopHat to map the reads from each group to the
The problem I am having is that if I select paired-end as the library type in
TopHat, it asks for two FASTQ files. Would I have to use FASTQ Splitter to
separate the joined FASTQ files again? If there is a more standard way to
handle this kind of barcoded data, I would appreciate hearing about it.
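To make the pairing concern concrete, here is a minimal Python sketch (run
outside Galaxy, not one of its tools) of splitting on the read-1 barcode while
keeping both mates together, so each barcode group ends up with two
synchronized sets of reads for a paired-end mapper. It assumes both files list
the reads in the same order and that barcodes must match exactly; the function
names and the "unmatched" bucket are hypothetical choices made for illustration.

```python
import io
from itertools import islice

BARCODE_LEN = 4  # per the post: barcodes are the first four bases of read 1


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a 4-line FASTQ stream."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(record)


def split_pairs_by_barcode(r1_handle, r2_handle, barcodes):
    """Group read pairs by the barcode at the start of read 1.

    Returns {barcode: ([read-1 records], [read-2 records])}. Pairs whose
    barcode is not listed go under the hypothetical key "unmatched".
    Assumption: both inputs hold the same reads in the same order
    (zip stops silently at the shorter input).
    """
    known = set(barcodes)
    groups = {bc: ([], []) for bc in known}
    groups["unmatched"] = ([], [])
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        header, seq, plus, qual = rec1
        barcode = seq[:BARCODE_LEN]
        if barcode in known:
            # Trim the barcode and its quality scores off read 1 only.
            rec1 = (header, seq[BARCODE_LEN:], plus, qual[BARCODE_LEN:])
            key = barcode
        else:
            key = "unmatched"
        groups[key][0].append(rec1)
        groups[key][1].append(rec2)
    return groups
```

Each barcode group then holds a read-1 list and a read-2 list in matching
order, which is the two-file shape a paired-end run expects.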
Thanks very much in advance,
P.S. Galaxy is an incredibly useful resource. Thanks!