Hi all,
I was just surprised to find what I consider to be a major bug in
fasta_to_tabular_converter.py used to convert FASTA into tabular.
Consider this toy example:
>alpha
ACGTAC
>beta
AGTGTA
>gamma with some description
AGGTACCA
What the converter gives is two columns (title line and sequence),
but the '>' is left in:
>alpha (tab) ACGTAC
>beta (tab) AGTGTA
>gamma with some description (tab) AGGTACCA
Given just two columns, what I was expecting was:
alpha (tab) ACGTAC
beta (tab) AGTGTA
gamma with some description (tab) AGGTACCA
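As a rough sketch of the two-column output I was expecting (this is not the actual fasta_to_tabular_converter.py code, and the function name fasta_to_two_columns is made up for illustration), the title is simply taken without its leading '>':

```python
def fasta_to_two_columns(lines):
    """Yield (title, sequence) pairs, with the leading '>' removed.

    Hypothetical helper for illustration only; it assumes plain FASTA
    input where each record starts with a '>' title line.
    """
    title = None
    seq_parts = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # Emit the previous record before starting a new one
            if title is not None:
                yield title, "".join(seq_parts)
            title = line[1:]  # drop the '>' marker
            seq_parts = []
        elif title is not None and line:
            # Sequence may be wrapped over several lines
            seq_parts.append(line.strip())
    if title is not None:
        yield title, "".join(seq_parts)


example = """>alpha
ACGTAC
>beta
AGTGTA
>gamma with some description
AGGTACCA
"""

rows = list(fasta_to_two_columns(example.splitlines()))
for title, seq in rows:
    print(title + "\t" + seq)
```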
I think this is a bug. In support of this view, I note the user-facing
tool (now in the Tool Shed) removes the '>' symbol:
https://toolshed.g2.bx.psu.edu/view/devteam/fasta_to_tabular
https://github.com/galaxyproject/tools-devteam/tree/master/tools/fasta_to_tabular
I have submitted a pull request to address this:
https://github.com/galaxyproject/galaxy/pull/11
Note what I really wanted was three columns, the ID, comment
and sequence:
alpha (tab) (empty) (tab) ACGTAC
beta (tab) (empty) (tab) AGTGTA
gamma (tab) with some description (tab) AGGTACCA
The user-facing tool does support this. I appreciate that changing the
built-in implicit converter to give three-column output could be a
problem for backward compatibility (in case anyone has written a
workflow that relies on the '>' version of the implicit conversion),
so instead I can make this conversion explicit in my workflow.
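For anyone wanting the three-column form, here is a minimal sketch of the split. It assumes the ID is everything up to the first whitespace in the title and the comment is the remainder; that rule is my assumption for illustration, not necessarily what the user-facing tool does, and split_title is a hypothetical name:

```python
def split_title(title):
    """Split a FASTA title (without '>') into (identifier, comment).

    Assumption: the identifier ends at the first run of whitespace;
    the comment is empty when there is nothing after the identifier.
    """
    parts = title.split(None, 1)
    identifier = parts[0] if parts else ""
    comment = parts[1] if len(parts) > 1 else ""
    return identifier, comment


# Two-column records from the toy example, '>' already removed
records = [
    ("alpha", "ACGTAC"),
    ("beta", "AGTGTA"),
    ("gamma with some description", "AGGTACCA"),
]

three_col = ["\t".join(split_title(title) + (seq,)) for title, seq in records]
for row in three_col:
    print(row)
```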
Regards,
Peter