Hi Graham,
This may be the long way around, but the workflow shared here will
convert the identifiers without requiring any programming or
regular-expression knowledge:
http://main.g2.bx.psu.edu/u/jen-bx-galaxy-edu/w/transform-fastq-nameidentifer
To use this:
1 - log in to Galaxy and switch to the history containing this dataset
(if needed)
2 - click on the link above
3 - click on "Import workflow" at the top of the page, right of center,
next to the green "+" icon
4 - on the "Import successful" page, click on "start using this workflow"
5 - on the "Your workflows" page, click on the down arrow at the end of
"imported: Transform fastq name/identifier" to open the menu, then click
on "Run" (second choice in list). If you ever need to reach this page
again, just click on "Workflow" in the top menu bar.
6 - your history from step 1 will now display with the workflow in the
center panel.
7 - set "Step 1: Input dataset", annotated as "CASAVA 1.8+ FASTQ file",
to the FASTQ file with the identifiers like:
"@N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0:"
8 - click on "Run workflow"
Once the workflow has run to completion, the intermediate datasets will
be hidden, leaving only the final dataset: a FASTQ file groomed with
quality score type "Sanger".
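For anyone who does want to script it, the identifier rewrite itself can be sketched in a few lines of Python. This is not what the workflow runs internally; the function name and the regular expression are illustrative assumptions, and the sketch only handles header lines in the CASAVA 1.8+ layout shown in step 7 (it does no quality-score grooming).

```python
import re

# Hypothetical pattern for a CASAVA 1.8+ header:
# "@<read id> <read number>:<filtered Y/N>:<control bits>:<index, may be empty>"
_CASAVA_HEADER = re.compile(r"^(@\S+) ([12]):[YN]:\d+:\S*$")


def convert_identifier(header):
    """Rewrite '@id 1:Y:0:' as '@id/1'; return other headers unchanged."""
    header = header.rstrip("\n")
    match = _CASAVA_HEADER.match(header)
    if match is None:
        # Not in the expected layout, so leave it alone.
        return header
    return "%s/%s" % (match.group(1), match.group(2))


print(convert_identifier("@N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0:"))
```

In a real FASTQ file this would be applied to the first line of every four-line record only.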
Hopefully this helps. Feel free to make changes; the imported copy of
the workflow is yours to modify.
Best,
Jen
Galaxy team
On 9/19/11 2:19 AM, graham etherington (TSL) wrote:
Hi,
I currently have read names with the format:
@N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0:
and would like to change them to the format:
@N57638:1:64JU0AAXX:1:1:1057:943/1
I use Manipulate FASTQ on all reads and set 'Manipulate Reads on:' to
'Name/Identifier' ('String Translate' becomes the only option).
I then set the 'From:' field to '1:Y:0:' and the 'To:' field to '/1'
(without the literal quotes).
I get the following error:
Traceback (most recent call last):
  File "/home/home/galaxy/software/galaxy-central/tools/fastq/fastq_manipulation.py", line 37, in <module>
    main()
  File "/home/home/galaxy/software/galaxy-central/tools/fastq/fastq_manipulation.py", line 25, in main
    new_read = fastq_manipulator.match_and_manipulate_read( fastq_read )
  File "/home/home/galaxy/software/galaxy-central/database/job_working_directory/942/tmpgp13Qy", line 15, in match_and_manipulate_read
    new_read = manipulate_read( fastq_read )
  File "/home/home/galaxy/software/galaxy-central/database/job_working_directory/942/tmpgp13Qy", line 8, in manipulate_read
    new_read.identifier = "@%s" % new_read.identifier[1:].translate( maketrans( binascii.unhexlify( "313a593a303a" ), binascii.unhexlify( "2f31" ) ) )
ValueError: maketrans arguments must have same length
So, do the From and To fields really need to be the same length?
This seems rather strange and unhelpful.
Am I doing something wrong?
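The traceback shows the 'String Translate' option calling Python's maketrans/translate, which map single characters one-for-one and do not replace substrings, so the two fields do have to be the same length. A minimal sketch of the difference follows, written for Python 3 (the tool in the traceback used the Python 2 maketrans); the variable names are illustrative only.

```python
src, dst = "1:Y:0:", "/1"

# translate() pairs up characters one-for-one, so tables of unequal
# length are rejected, just as in the traceback above.
try:
    str.maketrans(src, dst)
    same_length_required = False
except ValueError:
    same_length_required = True

# Equal-length tables work, but every matching character is changed
# wherever it occurs, which is not a substring replacement.
table = str.maketrans(":", "_")
translated = "N57638:1 1:Y:0:".translate(table)

# A substring replacement is what this conversion actually needs.
name = "N57638:1:64JU0AAXX:1:1:1057:943 1:Y:0:"
replaced = name.replace(" " + src, dst)

print(same_length_required, translated, replaced)
```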
Many thanks,
Graham
Dr. Graham Etherington
Bioinformatics Support Officer,
The Sainsbury Laboratory,
Norwich Research Park,
Norwich NR4 7UH.
UK
___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org. Please keep all replies on the list by
using "reply all" in your mail client. For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists,
please use the interface at:
http://lists.bx.psu.edu/
--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support