Re: [galaxy-dev] DustMasker tool for ncbi_blast_plus
On Mon, 11 Feb 2013 at 13:19, Peter Cock wrote:
> On Fri, Feb 8, 2013 at 4:30 PM, Nicola Soranzo sora...@crs4.it wrote:
> > On Wed, 6 Feb 2013 at 20:01 +0100, Nicola Soranzo wrote:
> > > Hi Peter,
> > > I added these file formats mostly as placeholders for a future implementation. Now I have changed the tool a bit by removing the acclist and seqloc_xml formats, since they are not recognized by the latest versions of dustmasker (I also sent an email to blast-h...@ncbi.nlm.nih.gov to inform them of this bug).
> > > As before, you can find the new version at:
> > > https://bitbucket.org/nsoranzo/ncbi_blast_plus
> > > I stripped the old commit and did a new one, not a very good practice, sorry about that.
>
> It seems to have confused the Bitbucket page a little, but I have checked in your initial wrapper to my development repository (I use the tools branch):
> https://bitbucket.org/peterjc/galaxy-central/commits/2284d485e36f74f19b0dbe78709b098d9eba4ef6
>
> Note I'm not going to include this in the Tool Shed release yet, we need to sort out the file format definitions first.

Hi Peter,
I implemented minimal datatypes for maskinfo ASN.1 binary and text, plus some other improvements to ncbi_blast_plus, and I sent you a pull request through Bitbucket for your development repository. I think that would be easier for you, let me know if it is not.

Nicola
___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-dev] DustMasker tool for ncbi_blast_plus
On Fri, Feb 15, 2013 at 6:10 PM, Nicola Soranzo sora...@crs4.it wrote:
> Peter wrote:
> > It seems to have confused the Bitbucket page a little, but I have checked in your initial wrapper to my development repository (I use the tools branch):
> > https://bitbucket.org/peterjc/galaxy-central/commits/2284d485e36f74f19b0dbe78709b098d9eba4ef6
> >
> > Note I'm not going to include this in the Tool Shed release yet, we need to sort out the file format definitions first.
>
> Hi Peter,
> I implemented minimal datatypes for maskinfo ASN.1 binary and text, plus some other improvements to ncbi_blast_plus, and I sent you a pull request through Bitbucket for your development repository. I think that would be easier for you, let me know if it is not.
>
> Nicola

That looks very useful Nicola - I hope to have time to test that next week :)

Thank you,
Peter
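As a rough illustration of why the thread separates the binary and text maskinfo ASN.1 datatypes, a content check could look something like the sketch below. It is an assumption-laden heuristic, not the code from the pull request: `looks_like_asn1_text` is a hypothetical helper name, it is not part of Galaxy or BLAST+, and it only tests whether the start of the file is plain ASCII.

```python
def looks_like_asn1_text(path, probe_size=1024):
    """Hypothetical heuristic: True if the first bytes of the file are ASCII text.

    Assumption: text ASN.1 output is plain ASCII, while binary ASN.1 output
    will normally contain NUL or high-bit bytes near the start of the file.
    """
    with open(path, "rb") as handle:
        chunk = handle.read(probe_size)
    if not chunk:
        # An empty file carries no evidence either way; treat it as "not text".
        return False
    if b"\x00" in chunk:
        return False
    try:
        chunk.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True
```

A real Galaxy datatype would need more than this (for example, checking that the content is actually masking data), so the function is best read as a placeholder for whatever sniffing the datatypes end up doing.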
Re: [galaxy-dev] DustMasker tool for ncbi_blast_plus
On Fri, Feb 8, 2013 at 4:30 PM, Nicola Soranzo sora...@crs4.it wrote:
> On Wed, 6 Feb 2013 at 20:01 +0100, Nicola Soranzo wrote:
> > Hi Peter,
> > I added these file formats mostly as placeholders for a future implementation. Now I have changed the tool a bit by removing the acclist and seqloc_xml formats, since they are not recognized by the latest versions of dustmasker (I also sent an email to blast-h...@ncbi.nlm.nih.gov to inform them of this bug).
> > As before, you can find the new version at:
> > https://bitbucket.org/nsoranzo/ncbi_blast_plus
> > I stripped the old commit and did a new one, not a very good practice, sorry about that.

It seems to have confused the Bitbucket page a little, but I have checked in your initial wrapper to my development repository (I use the tools branch):
https://bitbucket.org/peterjc/galaxy-central/commits/2284d485e36f74f19b0dbe78709b098d9eba4ef6

Note I'm not going to include this in the Tool Shed release yet, we need to sort out the file format definitions first.

> Hi Peter,
> I've added a new commit to this repo which updates the test output files to (recommended) BLAST 2.2.26+, since functional tests were returning errors. Hope you find it useful.

Also applied to my branch, thank you - I'd forgotten to update that (but intend at some point to refresh the test files and dependency install to use BLAST 2.2.27+ instead):
https://bitbucket.org/peterjc/galaxy-central/commits/f1f912f63bb4174f434e3f47eac58f2cfa3753e6

Sadly I've not actually got the unit tests to run at all yet, see:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-February/013245.html

Regards,
Peter
Re: [galaxy-dev] DustMasker tool for ncbi_blast_plus
On Wed, 6 Feb 2013 at 20:01 +0100, Nicola Soranzo wrote:
> Hi Peter,
> I added these file formats mostly as placeholders for a future implementation. Now I have changed the tool a bit by removing the acclist and seqloc_xml formats, since they are not recognized by the latest versions of dustmasker (I also sent an email to blast-h...@ncbi.nlm.nih.gov to inform them of this bug).
> As before, you can find the new version at:
> https://bitbucket.org/nsoranzo/ncbi_blast_plus
> I stripped the old commit and did a new one, not a very good practice, sorry about that.

Hi Peter,
I've added a new commit to this repo which updates the test output files to (recommended) BLAST 2.2.26+, since functional tests were returning errors. Hope you find it useful.

Nicola
--
Nicola Soranzo, Ph.D.
CRS4 Bioinformatics Program
Loc. Piscina Manna
09010 Pula (CA), Italy
http://www.bioinformatica.crs4.it/
Re: [galaxy-dev] DustMasker tool for ncbi_blast_plus
Adding galaxy-dev list in CC as suggested by Peter.

On Wed, 6 Feb 2013 at 16:57, Peter Cock wrote:
> On Tue, Feb 5, 2013 at 11:45 AM, Nicola Soranzo sora...@crs4.it wrote:
> > Dear Peter,
> > I have created a simple Galaxy tool for DustMasker of the NCBI BLAST+ suite, which I think would be a useful addition to the ncbi_blast_plus repository you're maintaining in the Galaxy Tool Shed.
> > You can find it and hopefully pull it from:
> > https://bitbucket.org/nsoranzo/ncbi_blast_plus
> >
> > Kind regards,
> > Nicola
>
> Hi Nicola,
> Thanks for getting involved - we can discuss this on the galaxy-dev mailing list if you prefer? For now I have CC'd Edward Kirton as he is/was working on masking in BLAST databases for Galaxy.
>
> I can see the new file tools/ncbi_blast_plus/ncbi_dustmasker_wrapper.xml however it refers to multiple new file formats - where are they defined?
>
> * acclist
> * maskinfo_asn1_bin
> * maskinfo_asn1_text
> * seqloc_asn1_bin
> * seqloc_asn1_text

Hi Peter,
I added these file formats mostly as placeholders for a future implementation. Now I have changed the tool a bit by removing the acclist and seqloc_xml formats, since they are not recognized by the latest versions of dustmasker (I also sent an email to blast-h...@ncbi.nlm.nih.gov to inform them of this bug).
As before, you can find the new version at:
https://bitbucket.org/nsoranzo/ncbi_blast_plus
I stripped the old commit and did a new one, not a very good practice, sorry about that.

> Have you looked at the (commented out) bits in the makeblastdb wrapper which would perhaps be relevant? This is something Edward Kirton wrote which I haven't integrated yet:
>
>     <!-- SEQUENCE MASKING OPTIONS -->
>     <!-- TODO
>     <repeat name="mask_data" title="Provide one or more files containing masking data">
>         <param name="file" type="data" format="asnb" label="File containing masking data" help="As produced by NCBI masking applications (e.g. dustmasker, segmasker, windowmasker)" />
>     </repeat>
>     <repeat name="gi_mask" title="Create GI indexed masking data">
>         <param name="file" type="data" format="asnb" label="Masking data output file" />
>     </repeat>
>     -->
>
> Perhaps all you need to offer in ncbi_dustmasker_wrapper.xml is 'fasta' and 'asnb' (binary ASN) formats? Edward - did you have an 'asnb' definition?

'fasta' and 'interval' are the ones I'm interested in for my use case.
'maskinfo_asn1_bin' is probably the one referenced as 'asnb' in the cited code (ASN.1 is a general data serialization format like XML). A file in this format can be given as input to makeblastdb -mask_data.

Nicola
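To make the last point concrete, the hand-off from dustmasker to makeblastdb could be sketched roughly as below. The flags used (`-in`, `-infmt`, `-parse_seqids`, `-outfmt`, `-out`, `-dbtype`, `-mask_data`) are BLAST+ command-line options; the file names and the two helper functions are made up for illustration, and the sketch only builds and prints the command lines rather than running them or showing what the Galaxy wrappers do.

```python
import shlex


def dustmasker_cmd(fasta_in, mask_out, outfmt="maskinfo_asn1_bin"):
    """Build a dustmasker command that writes masking data for a FASTA file."""
    return [
        "dustmasker",
        "-in", fasta_in,
        "-infmt", "fasta",
        "-parse_seqids",      # keep sequence IDs so makeblastdb can match them
        "-outfmt", outfmt,    # binary ASN.1 masking info by default
        "-out", mask_out,
    ]


def makeblastdb_cmd(fasta_in, mask_file, db_name):
    """Build a makeblastdb command that attaches masking data to a nucleotide DB."""
    return [
        "makeblastdb",
        "-in", fasta_in,
        "-dbtype", "nucl",
        "-parse_seqids",
        "-mask_data", mask_file,
        "-out", db_name,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    commands = [
        dustmasker_cmd("genome.fasta", "genome_dust.asnb"),
        makeblastdb_cmd("genome.fasta", "genome_dust.asnb", "genome_db"),
    ]
    for cmd in commands:
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building argument lists instead of shell strings keeps quoting problems out of the picture if the commands are later passed to `subprocess.run`.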