Re: [galaxy-dev] Providing BLAST db in a data library
On Mon, Jul 28, 2014 at 9:43 AM, Peter Cock wrote:
> On Mon, Jul 28, 2014 at 8:28 AM, Ulf Schaefer wrote:
>> Dear Nate, dear Peter
>>
>> Sorry for the delay in replying.
>>
>> I can import both HTML and blastdb from a history to a data library. If
>> I try to get the data out of the library into another history, I am
>> successful for the html but not for the blastdb. The problem seems to be
>> that the primary data file (the /path/dataset_12345.dat) is empty for
>> the blastdb, while the html primary file has something in it.
>
> OK. Can you tell where Galaxy thinks the library files are on disk,
> and check to see if the folder of BLAST database files is actually
> there?
>
>> When I try to import the blastdb (from library to history) there is a
>> message along the lines of "can't import empty file". I hypothesise
>> (admittedly without having looked at a line of code) that there is a
>> test for file size 0 somewhere that is either altogether unnecessary or,
>> more likely, does not take into account that for composite datatypes it
>> might be completely legitimate for the primary file to be empty.
>
> This guess makes sense - but I've not yet tried to trace through
> the code either.
>
>> Or is my primary blastdb file not supposed to be empty in the first
>> place? I can blast against it just fine.
>
> The BLAST databases do not define/populate a primary file, so
> Galaxy seems to create a dummy empty file on its own. I have
> wondered about altering the BLAST database datatype definition
> to have a human readable text file as the "primary file" (i.e. the
> information currently saved as a text log file when creating a
> database).

Correction: I actually implemented this late last year (included in
BLAST+ wrapper version v0.0.22 onwards, and the Galaxy BLAST datatypes
version v0.0.18 onwards):

https://github.com/peterjc/galaxy_blast/commit/9b3f65cddcc60de26de63272c362c6ca53f6559d
https://github.com/peterjc/galaxy_blast/commit/2ebfb790d5a1bbe310c3d7ccc2b953c2c37bccf2

The makeblastdb wrapper will send the stdout (log information) to the
dummy index file, see the end of the <command> tag in:

https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_makeblastdb.xml

The display_data method for a BLAST database will show any makeblastdb
log information held in the dummy index file, see:

https://github.com/peterjc/galaxy_blast/blob/master/datatypes/blast_datatypes/blast.py

i.e. only older BLAST databases in histories should have empty dummy
index files, which will mitigate the library problem:

https://trello.com/c/bNEKfOWR

Peter
___
Please keep all replies on the list by using "reply all"
in your mail client. To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] Providing BLAST db in a data library
Thanks for tracking down the problem - it sounds like it is a Galaxy
bug then so I have created a Trello card
(https://trello.com/c/bNEKfOWR).

-John
Re: [galaxy-dev] Providing BLAST db in a data library
On Wed, Jul 30, 2014 at 11:52 AM, Ulf Schaefer wrote:
> Dear Nate, dear Peter
>
> Again, sorry for the delay in replying.
>
> Yes I can. It looks like this
>
> [galaxy@srv ~]$ cat /galaxy/database/files/081/dataset_81002.dat
> [galaxy@srv ~]$ ls /galaxy/database/files/081/dataset_81002_files/
> blastdb.nhd blastdb.nhi blastdb.nhr blastdb.nin blastdb.nog
> blastdb.nsd blastdb.nsi blastdb.nsq

Good. Thanks for confirming that.

> I think the simplest solution would be to put something in the primary
> file. Just a short string that gets the file size above 0.

That won't help with all the existing datasets out there - I think we
rather need to fix something in the Galaxy code for composite files...

> I personally have followed your initial suggestion and made the dbs
> available globally via the .loc file.
>
> Thanks again
> Ulf

Great.

Peter
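
To get a feel for how many existing datasets are affected, one could scan the files directory for the pattern shown in the listing above: an empty dataset_N.dat sitting next to a populated dataset_N_files/ folder. The following is a minimal, read-only sketch in Python. The function name is made up for this example, it assumes only the on-disk naming seen in Ulf's message, and it is not part of Galaxy.

```python
from pathlib import Path


def find_empty_composite_primaries(files_root):
    """List empty dataset_N.dat files that have a non-empty dataset_N_files folder.

    Hypothetical helper: relies only on the naming convention seen in the
    thread (dataset_N.dat next to dataset_N_files/), not on any Galaxy API.
    """
    affected = []
    for primary in sorted(Path(files_root).rglob("dataset_*.dat")):
        # dataset_81002.dat -> dataset_81002_files
        extra_dir = primary.with_name(primary.stem + "_files")
        if (
            primary.stat().st_size == 0
            and extra_dir.is_dir()
            and any(extra_dir.iterdir())
        ):
            affected.append(primary)
    return affected
```

Calling find_empty_composite_primaries("/galaxy/database/files") on the server from the listing would be expected to include dataset_81002.dat. The sketch changes nothing on disk; it only reports candidates.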
Re: [galaxy-dev] Providing BLAST db in a data library
Dear Nate, dear Peter

Again, sorry for the delay in replying.

Yes I can. It looks like this

[galaxy@srv ~]$ cat /galaxy/database/files/081/dataset_81002.dat
[galaxy@srv ~]$ ls /galaxy/database/files/081/dataset_81002_files/
blastdb.nhd blastdb.nhi blastdb.nhr blastdb.nin blastdb.nog
blastdb.nsd blastdb.nsi blastdb.nsq

I think the simplest solution would be to put something in the primary
file. Just a short string that gets the file size above 0.

I personally have followed your initial suggestion and made the dbs
available globally via the .loc file.

Thanks again
Ulf

**
The information contained in the EMail and any attachments is
confidential and intended solely and for the attention and use of the
named addressee(s). It may not be disclosed to any other person without
the express authority of Public Health England, or the intended
recipient, or both. If you are not the intended recipient, you must not
disclose, copy, distribute or retain this message or any part of it.
This footnote also confirms that this EMail has been swept for computer
viruses by Symantec.Cloud, but please re-sweep any attachments before
opening or saving. http://www.gov.uk/PHE
**
Re: [galaxy-dev] Providing BLAST db in a data library
On Mon, Jul 28, 2014 at 8:28 AM, Ulf Schaefer wrote:
> Dear Nate, dear Peter
>
> Sorry for the delay in replying.
>
> I can import both HTML and blastdb from a history to a data library. If
> I try to get the data out of the library into another history, I am
> successful for the html but not for the blastdb. The problem seems to be
> that the primary data file (the /path/dataset_12345.dat) is empty for
> the blastdb, while the html primary file has something in it.

OK. Can you tell where Galaxy thinks the library files are on disk,
and check to see if the folder of BLAST database files is actually
there?

> When I try to import the blastdb (from library to history) there is a
> message along the lines of "can't import empty file". I hypothesise
> (admittedly without having looked at a line of code) that there is a
> test for file size 0 somewhere that is either altogether unnecessary or,
> more likely, does not take into account that for composite datatypes it
> might be completely legitimate for the primary file to be empty.

This guess makes sense - but I've not yet tried to trace through
the code either.

> Or is my primary blastdb file not supposed to be empty in the first
> place? I can blast against it just fine.

The BLAST databases do not define/populate a primary file, so
Galaxy seems to create a dummy empty file on its own. I have
wondered about altering the BLAST database datatype definition
to have a human readable text file as the "primary file" (i.e. the
information currently saved as a text log file when creating a
database).

> Thanks a lot for your help
> Ulf

You too - you've found an "interesting" bug...

Peter
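
To make the hypothesis concrete: nobody in this thread has traced the actual Galaxy check yet, so the following is only a sketch of what a composite-aware "is this dataset empty?" test could look like. The function name dataset_has_data is hypothetical, and the dataset_N.dat / dataset_N_files naming is taken from the paths mentioned above. It is not Galaxy's code.

```python
import os


def dataset_has_data(primary_path):
    """Hypothetical emptiness test that tolerates an empty primary file.

    A dataset counts as having data if its primary file is non-empty, or
    if the sibling <name>_files folder (assumed naming) contains anything.
    """
    if os.path.getsize(primary_path) > 0:
        return True
    base, _ext = os.path.splitext(primary_path)
    extra_dir = base + "_files"
    return os.path.isdir(extra_dir) and len(os.listdir(extra_dir)) > 0
```

A plain size test on the primary file alone would reject a BLAST database whose .dat file is empty. The version above would accept it as long as the _files folder is populated, while still treating a dataset with neither as empty.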
Re: [galaxy-dev] Providing BLAST db in a data library
Dear Nate, dear Peter

Sorry for the delay in replying.

I can import both HTML and blastdb from a history to a data library. If
I try to get the data out of the library into another history, I am
successful for the html but not for the blastdb. The problem seems to be
that the primary data file (the /path/dataset_12345.dat) is empty for
the blastdb, while the html primary file has something in it.

When I try to import the blastdb (from library to history) there is a
message along the lines of "can't import empty file". I hypothesise
(admittedly without having looked at a line of code) that there is a
test for file size 0 somewhere that is either altogether unnecessary or,
more likely, does not take into account that for composite datatypes it
might be completely legitimate for the primary file to be empty.

Or is my primary blastdb file not supposed to be empty in the first
place? I can blast against it just fine.

Thanks a lot for your help
Ulf
Re: [galaxy-dev] Providing BLAST db in a data library
On Thu, Jul 24, 2014 at 2:50 PM, Nate Coraor wrote:
> On Jul 23, 2014, at 6:42 AM, Peter Cock wrote:
>
>> Interesting hypothesis - you may well be right.
>>
>> Galaxy guys - who is the expert to talk to on this and/or where
>> in the code should we be looking?
>>
>> Thanks,
>>
>> Peter
>
> I think there's a bit of a mixup here - Peter, I believe you were asking
> if other composite types with an html primary dataset could be imported
> from the history to library, but Ulf, your test was the other direction
> (library->history). I'd be interested in knowing the outcome of the
> history->library test as well.

Good catch - yes, that was what I was asking about. Ulf?

> I am woefully ignorant about the blastdbn datatype. Is the primary
> file supposed to be html type but empty?

The BLAST databases are 'basic' composite datatypes, of which
the most commonly used example is HTML (and some bits of
the base class code seem to assume HTML). This means
testing if something works with HTML is a good first step.

https://github.com/peterjc/galaxy_blast/tree/master/datatypes/blast_datatypes

Peter
Re: [galaxy-dev] Providing BLAST db in a data library
On Jul 23, 2014, at 6:42 AM, Peter Cock wrote:

> Interesting hypothesis - you may well be right.
>
> Galaxy guys - who is the expert to talk to on this and/or where
> in the code should we be looking?
>
> Thanks,
>
> Peter

I think there's a bit of a mixup here - Peter, I believe you were asking
if other composite types with an html primary dataset could be imported
from the history to library, but Ulf, your test was the other direction
(library->history). I'd be interested in knowing the outcome of the
history->library test as well.

I am woefully ignorant about the blastdbn datatype. Is the primary
file supposed to be html type but empty?

--nate
Re: [galaxy-dev] Providing BLAST db in a data library
Interesting hypothesis - you may well be right.

Galaxy guys - who is the expert to talk to on this and/or where
in the code should we be looking?

Thanks,

Peter
Re: [galaxy-dev] Providing BLAST db in a data library
Dear Peter

Thanks for your reply.

I can import an html report (e.g. FastQC output) successfully into a new
history from a data library. But the .dat file for the html is not empty
like the one for the blastdb. Makes me think that I could do this with a
blast db as well, if only it would not check for size 0 at the time of
importing it.

Thanks
Ulf
Re: [galaxy-dev] Providing BLAST db in a data library
On Wed, Jul 23, 2014 at 10:47 AM, Ulf Schaefer wrote:
> Dear all
>
> I have several smallish BLAST databases that I would like to provide in
> a data library. I create them in a history with the makeblastdb tool and
> then try to add them to the library. I see that for each blast db there
> is an empty file created (like /path/dataset_12345.dat) and a folder
> with the same name (/path/dataset_12345_files/) that contains the actual
> db files (blastdb.n*).
>
> In my library the blastdb shows up empty and I cannot import it back to
> another history. It does not seem to be aware of the _files folder,
> despite it being the right data type (blastdbn).
>
> Any ideas what I am doing wrong?
>
> Thanks a lot for your help
> Ulf

Hi Ulf,

I've never tried that. It could be a bug in Galaxy importing
composite datatypes into a library, or something in the BLAST
database definition which needs fixing. Does importing an
HTML report (with child files like images) into a library work
for you? (This is another composite datatype so a useful
comparison).

Rather than using Data Libraries, we just list all the locally
installed shared BLAST databases via the BLAST *.loc
files instead.

Note using the *.loc files makes the databases available to
all the Galaxy users, while with a Data Library you can
control access to specific groups/roles.

Regards,

Peter
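
For readers who want to try the *.loc route, here is a small Python sketch that builds one entry. It assumes the three-column, tab-separated layout (unique ID, caption shown to users, database path without file extension) that the sample *.loc files shipped with the BLAST+ wrappers appear to use, so please check the sample file in your own installation before relying on it. The function name, ID, caption and path below are all made up for illustration.

```python
def blastdb_loc_line(unique_id, caption, base_path):
    """Build one tab-separated *.loc entry (assumed 3-column layout).

    base_path is the database path without extension, i.e. the value one
    would pass to BLAST's -db option (".../blastdb" for blastdb.nin etc.).
    """
    fields = (unique_id, caption, base_path)
    for field in fields:
        # Tabs or newlines inside a field would corrupt the column layout.
        if "\t" in field or "\n" in field:
            raise ValueError("fields must not contain tabs or newlines")
    return "\t".join(fields) + "\n"


# Hypothetical values, for illustration only.
line = blastdb_loc_line(
    "my_small_db",
    "My small nucleotide database",
    "/data/blastdb/my_small_db/blastdb",
)
```

The resulting line would be appended to the relevant *.loc file. As noted above, anything listed there is visible to all users of the Galaxy instance, unlike a Data Library where access can be restricted.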