Hi all,

I was just surprised to find what I consider to be a major bug in
fasta_to_tabular_converter.py, which is used to convert FASTA into tabular.

Consider this toy example:

    >alpha
    ACGTAC
    >beta
    AGTGTA
    >gamma with some description
    AGGTACCA

What the converter gives is two columns (title line and sequence),
but the '>' is left in:

    >alpha (tab) ACGTAC
    >beta (tab) AGTGTA
    >gamma with some description (tab) AGGTACCA

Given just two columns, what I was expecting was:

    alpha (tab) ACGTAC
    beta (tab) AGTGTA
    gamma with some description (tab) AGGTACCA
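
To make the expected behaviour concrete, here is a minimal Python sketch
of a two-column conversion that drops the leading '>'. It is not the code
in fasta_to_tabular_converter.py or in the pull request; the function name
fasta_to_two_columns is hypothetical, and it assumes multi-line sequences
should be joined into one string.

```python
def fasta_to_two_columns(lines):
    """Yield (title, sequence) pairs with the leading '>' removed.

    Hypothetical helper for illustration only, not Galaxy's converter.
    """
    title = None
    chunks = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if title is not None:
                yield title, "".join(chunks)
            title = line[1:]  # strip only the '>' marker
            chunks = []
        elif title is not None:
            # Sequence line: join wrapped lines into one string.
            chunks.append(line.strip())
    if title is not None:
        yield title, "".join(chunks)


def to_tabular(lines):
    """Format each record as 'title<TAB>sequence'."""
    return ["%s\t%s" % record for record in fasta_to_two_columns(lines)]
```

Feeding the toy example above through to_tabular gives one tab-separated
line per record, with no '>' in the first column.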

I think this is a bug. In support of this view, I note the user-facing
tool (now in the Tool Shed) removes the '>' symbol:

https://toolshed.g2.bx.psu.edu/view/devteam/fasta_to_tabular
https://github.com/galaxyproject/tools-devteam/tree/master/tools/fasta_to_tabular

I have submitted a pull request to address this:

https://github.com/galaxyproject/galaxy/pull/11

Note what I really wanted was three columns, the ID, comment
and sequence:

    alpha (tab) (empty) (tab) ACGTAC
    beta (tab) (empty) (tab) AGTGTA
    gamma (tab) with some description (tab) AGGTACCA

The user-facing tool does support this. I appreciate that changing the
built-in implicit converter to give three-column output could be a
problem for backward compatibility (for example, if anyone has written
a workflow that relies on the current '>' output of the implicit
conversion), so I can instead make this conversion explicit in my
workflow.
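
As a minimal sketch of the three-column split described above (ID,
comment, sequence), assuming the ID is everything up to the first
whitespace in the title line and the comment is the remainder. This is
not the Tool Shed tool's code; split_title and three_column_row are
hypothetical names.

```python
def split_title(title):
    """Split a FASTA title (without '>') into (identifier, comment).

    Assumption: the identifier ends at the first run of whitespace.
    """
    parts = title.split(None, 1)
    identifier = parts[0] if parts else ""
    comment = parts[1] if len(parts) > 1 else ""
    return identifier, comment


def three_column_row(title, sequence):
    """Format one record as 'ID<TAB>comment<TAB>sequence'."""
    identifier, comment = split_title(title)
    return "\t".join([identifier, comment, sequence])
```

A record with no description, such as alpha, yields an empty middle
column rather than dropping the column.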

Regards,

Peter
