On Wed, Nov 20, 2013 at 7:10 PM, Dooley, Damion <damion.doo...@bccdc.ca> wrote:
> I'm doing a one-step generic reporting tool along the lines of the "BLAST
> XML to tabular" script by Peter. I was just about to ask about this line,
> which looked pretty much like a bug:
>
>     sallseqid = ";".join(name.split(None,1)[0] for name in hit_def.split(" >"))
>
> Then I found the patch from Nov 7th 2013:
>
> https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/blastxml_to_tabular.py
>
>     try:
>         sallseqid = ";".join(name.split(None,1)[0] for name in hit_def.split(" >"))
>     except IndexError as e:
>         stop_err("Problem splitting multuple hits?\n%r\n--> %s" % (hit_def, e))
>
> Yay! But what I've seen in recent XML output reports is that the ">"
> content has been changed to "&gt;". E.g.
>
> https://github.com/biopython/biopython/blob/master/Tests/Blast/mirna.xml
>
>     <Hit>
>       <Hit_num>66</Hit_num>
>       <Hit_id>gi|195029385|ref|XR_047134.1|</Hit_id>
>       <Hit_def>Drosophila grimshawi miR-7-RA (Dgri\mir-7), ncRNA &gt;gi|195336156|ref|XR_048470.1| Drosophila sechellia miR-7-RA (Dsec\mir-7), ncRNA &gt;gi|195585143|ref|XR_050309.1| Drosophila simulans miR-7-RA (Dsim\mir-7), ncRNA</Hit_def>
>       <Hit_accession>XR_047134</Hit_accession>
>       ...
>
> So perhaps a stop_err() could be avoided if the test is for "&gt;" instead?
> I assume that no variants of Python's ElementTree.iterparse() will
> unescape content when returned via the iterator?
>
> Damion

On Wed, Nov 20, 2013 at 7:31 PM, Dooley, Damion <damion.doo...@bccdc.ca> wrote:
> Whoops - I realize now findtext() must be unescaping all "&gt;", so Peter
> was trying to address other non-splitting occurrences of " >" as per his
> patch notes. But perhaps a stop_err() isn't merited in this case?
>
> So ignore my test for "&gt;" comment.
>
> Regards,
>
> Damion

OK - good. I was worried that there might be some inconsistency between
different databases or versions of BLAST about how the > was encoded.

As to why I treat this as a fatal error (calling stop_err), the
alternative would be to issue a warning to stderr and then guess what
the data ought to look like. That just seems like asking for trouble -
a big red error should ensure I hear bug reports ;)

Zen of Python: Errors should never pass silently.

Peter
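
For anyone reading this thread later, here is a minimal sketch of the two points discussed: ElementTree hands back "&gt;" as a literal ">", and splitting on " >" fails with an IndexError only when a chunk turns out to be empty. The XML is a trimmed copy of the mirna.xml hit quoted earlier. The names all_seq_ids and parse_hits are hypothetical, prepending Hit_id to Hit_def is an assumption made so that every chunk starts with an identifier, and a ValueError stands in for the script's stop_err(). This is not the code from blastxml_to_tabular.py.

```python
import io
import xml.etree.ElementTree as ET

# Trimmed-down <Hit> modelled on the mirna.xml example quoted in this thread.
XML = (
    "<Hit>"
    "<Hit_id>gi|195029385|ref|XR_047134.1|</Hit_id>"
    "<Hit_def>Drosophila grimshawi miR-7-RA (Dgri\\mir-7), ncRNA "
    "&gt;gi|195336156|ref|XR_048470.1| Drosophila sechellia miR-7-RA (Dsec\\mir-7), ncRNA "
    "&gt;gi|195585143|ref|XR_050309.1| Drosophila simulans miR-7-RA (Dsim\\mir-7), ncRNA"
    "</Hit_def>"
    "</Hit>"
)


def all_seq_ids(hit_id, hit_def):
    """Hypothetical helper: join the first token of each ' >'-separated entry."""
    # Assumption: prepend Hit_id so the first chunk also starts with an identifier.
    combined = "%s %s" % (hit_id, hit_def)
    try:
        return ";".join(chunk.split(None, 1)[0] for chunk in combined.split(" >"))
    except IndexError as e:
        # An empty chunk (e.g. a description ending in " >") has no first token.
        # Fail loudly rather than guess what the data ought to look like.
        raise ValueError("Problem splitting multiple hits?\n%r\n--> %s" % (combined, e))


def parse_hits(xml_bytes):
    """Hypothetical helper: return (Hit_def text, joined IDs) for each <Hit>."""
    results = []
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == "Hit":
            hit_def = elem.findtext("Hit_def")  # "&gt;" is already unescaped to ">"
            results.append((hit_def, all_seq_ids(elem.findtext("Hit_id"), hit_def)))
            elem.clear()  # release the parsed element, as is usual with iterparse
    return results


hits = parse_hits(XML.encode("utf-8"))
```

Because the parser has already unescaped the text, a test for the literal string "&gt;" would never match, so splitting on " >" is the check that can actually fire.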