Re: [galaxy-dev] Providing BLAST db in a data library

2014-08-14 Thread Peter Cock
On Mon, Jul 28, 2014 at 9:43 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Mon, Jul 28, 2014 at 8:28 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear Nate, dear Peter

 Sorry for the delay in replying.

 I can import both HTML and blastdb from a history to a data library. If
 I try to get the data out of the library into another history, I am
 successful for the html but not for the blastdb. The problem seems to be
 that the primary data file (the /path/dataset_12345.dat) is empty for
 the blastdb, while the html primary file has something in it.

 OK. Can you tell where Galaxy thinks the library files are on disk,
 and check to see if the folder of BLAST database files is actually
 there?

 When I try to import the blastdb (from library to history) there is a
 message along the lines of can't import empty file. I hypothesise
 (admittedly without having looked at a line of code) that there is a
 test for file size 0 somewhere that is either altogether unnecessary or,
 more likely, does not take into account that for composite datatypes it
 might be completely legitimate for the primary file to be empty.

 This guess makes sense - but I've not yet tried to trace through
 the code either.

 Or is my primary blastdb file not supposed to be empty in the first
 place? I can blast against it just fine.

 The BLAST databases do not define/populate a primary file, so
 Galaxy seems to create a dummy empty file on its own. I have
 wondered about altering the BLAST database datatype definition
 to have a human readable text file as the primary file (i.e. the
 information currently saved as a text log file when creating a
 database).

Correction: I actually implemented this late last year (included in
BLAST+ wrapper version v0.0.22 onwards, and the Galaxy
BLAST datatypes version v0.0.18 onwards):

https://github.com/peterjc/galaxy_blast/commit/9b3f65cddcc60de26de63272c362c6ca53f6559d
https://github.com/peterjc/galaxy_blast/commit/2ebfb790d5a1bbe310c3d7ccc2b953c2c37bccf2

The makeblastdb wrapper will send the stdout (log information)
to the dummy index file, see the end of the command tag in:
https://github.com/peterjc/galaxy_blast/blob/master/tools/ncbi_blast_plus/ncbi_makeblastdb.xml

The display_data method for a BLAST database will show any
makeblastdb log information held in the dummy index file, see
https://github.com/peterjc/galaxy_blast/blob/master/datatypes/blast_datatypes/blast.py

i.e. Only older BLAST databases in histories should have empty
dummy index files, which will mitigate the library problem:
https://trello.com/c/bNEKfOWR
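
The redirection described above can be sketched in Python. This is only
an illustration, not the wrapper's actual code (the wrapper does this at
the end of the command tag in the tool XML). The helper name run_and_log
is hypothetical, and the paths in the usage comment are placeholders.

```python
import subprocess
from pathlib import Path


def run_and_log(cmd, primary_file):
    """Run cmd and write its stdout to primary_file.

    Hypothetical helper: it mimics sending the makeblastdb log to the
    otherwise empty primary (dummy index) file of the dataset.
    Returns the size of the primary file in bytes afterwards.
    """
    primary = Path(primary_file)
    with primary.open("w") as log:
        # check=True raises CalledProcessError if the command fails
        subprocess.run(cmd, stdout=log, check=True)
    return primary.stat().st_size


# Example call (placeholder paths, requires makeblastdb on the PATH):
# run_and_log(
#     ["makeblastdb", "-in", "seqs.fasta", "-dbtype", "nucl",
#      "-out", "dataset_12345_files/blastdb"],
#     "dataset_12345.dat",
# )
```

As long as the command prints anything, the primary file is no longer
zero bytes, which is the property that matters for the library import.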

Peter
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/


Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-30 Thread Ulf Schaefer
Dear Nate, dear Peter

Again, sorry for the delay in replying.

Yes I can. It looks like this

[galaxy@srv ~]$ cat /galaxy/database/files/081/dataset_81002.dat
[galaxy@srv ~]$ ls /galaxy/database/files/081/dataset_81002_files/
blastdb.nhd  blastdb.nhi  blastdb.nhr  blastdb.nin  blastdb.nog 
blastdb.nsd  blastdb.nsi  blastdb.nsq
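
The same check can be scripted. This is a minimal sketch that assumes
only the layout shown above (a dataset_N.dat primary file next to a
dataset_N_files/ directory); the helper name describe_dataset is
hypothetical and not part of Galaxy.

```python
from pathlib import Path


def describe_dataset(primary_path):
    """Return (primary file size in bytes, sorted names of extra files).

    Assumes the extra files live in a sibling directory named after the
    primary file, e.g. dataset_81002.dat -> dataset_81002_files/.
    """
    primary = Path(primary_path)
    extra_dir = primary.with_name(primary.stem + "_files")
    if extra_dir.is_dir():
        extra_files = sorted(p.name for p in extra_dir.iterdir())
    else:
        extra_files = []
    return primary.stat().st_size, extra_files
```

A size of 0 together with a non-empty list of extra files is the
situation described in this thread.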

I think the simplest solution would be to put something in the primary 
file. Just a short string that gets the file size above 0.

I personally have followed your initial suggestion and made the dbs
available globally via the .loc file.

Thanks again
Ulf


On 28/07/14 09:43, Peter Cock wrote:
 On Mon, Jul 28, 2014 at 8:28 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear Nate, dear Peter

 Sorry for the delay in replying.

 I can import both HTML and blastdb from a history to a data library. If
 I try to get the data out of the library into another history, I am
 successful for the html but not for the blastdb. The problem seems to be
 that the primary data file (the /path/dataset_12345.dat) is empty for
 the blastdb, while the html primary file has something in it.

 OK. Can you tell where Galaxy thinks the library files are on disk,
 and check to see if the folder of BLAST database files is actually
 there?

 When I try to import the blastdb (from library to history) there is a
 message along the lines of can't import empty file. I hypothesise
 (admittedly without having looked at a line of code) that there is a
 test for file size 0 somewhere that is either altogether unnecessary or,
 more likely, does not take into account that for composite datatypes it
 might be completely legitimate for the primary file to be empty.

 This guess makes sense - but I've not yet tried to trace through
 the code either.

 Or is my primary blastdb file not supposed to be empty in the first
 place? I can blast against it just fine.

 The BLAST databases do not define/populate a primary file, so
 Galaxy seems to create a dummy empty file on its own. I have
 wondered about altering the BLAST database datatype definition
 to have a human readable text file as the primary file (i.e. the
 information currently saved as a text log file when creating a
 database).

 Thanks a lot for your help
 Ulf

 You too - you've found an interesting bug...

 Peter


**
The information contained in the EMail and any attachments is confidential and 
intended solely and for the attention and use of the named addressee(s). It may 
not be disclosed to any other person without the express authority of Public 
Health England, or the intended recipient, or both. If you are not the intended 
recipient, you must not disclose, copy, distribute or retain this message or 
any part of it. This footnote also confirms that this EMail has been swept for 
computer viruses by Symantec.Cloud, but please re-sweep any attachments before 
opening or saving. http://www.gov.uk/PHE
**



Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-30 Thread Peter Cock
On Wed, Jul 30, 2014 at 11:52 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear Nate, dear Peter

 Again, sorry for the delay in replying.

 Yes I can. It looks like this

 [galaxy@srv ~]$ cat /galaxy/database/files/081/dataset_81002.dat
 [galaxy@srv ~]$ ls /galaxy/database/files/081/dataset_81002_files/
 blastdb.nhd  blastdb.nhi  blastdb.nhr  blastdb.nin  blastdb.nog
 blastdb.nsd  blastdb.nsi  blastdb.nsq

Good. Thanks for confirming that.

 I think the simplest solution would be to put something in the primary
 file. Just a short string that gets the file size above 0.

That won't help with all the existing datasets out there - I think we
rather need to fix something in the Galaxy code for composite files...

 I personally have followed your initial suggestion and made the dbs
 available globally via the .loc file.

 Thanks again
 Ulf

Great.

Peter


Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-30 Thread John Chilton
Thanks for tracking down the problem - it sounds like it is a Galaxy
bug then so I have created a Trello card
(https://trello.com/c/bNEKfOWR).

-John

On Wed, Jul 30, 2014 at 7:06 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
 On Wed, Jul 30, 2014 at 11:52 AM, Ulf Schaefer ulf.schae...@phe.gov.uk 
 wrote:
 Dear Nate, dear Peter

 Again, sorry for the delay in replying.

 Yes I can. It looks like this

 [galaxy@srv ~]$ cat /galaxy/database/files/081/dataset_81002.dat
 [galaxy@srv ~]$ ls /galaxy/database/files/081/dataset_81002_files/
 blastdb.nhd  blastdb.nhi  blastdb.nhr  blastdb.nin  blastdb.nog
 blastdb.nsd  blastdb.nsi  blastdb.nsq

 Good. Thanks for confirming that.

 I think the simplest solution would be to put something in the primary
 file. Just a short string that gets the file size above 0.

 That won't help with all the existing datasets out there - I think we
 rather need to fix something in the Galaxy code for composite files...

 I personally have followed your initial suggestion and made the dbs
 available globally via the .loc file.

 Thanks again
 Ulf

 Great.

 Peter


Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-28 Thread Ulf Schaefer
Dear Nate, dear Peter

Sorry for the delay in replying.

I can import both HTML and blastdb from a history to a data library. If 
I try to get the data out of the library into another history, I am
successful for the html but not for the blastdb. The problem seems to be 
that the primary data file (the /path/dataset_12345.dat) is empty for 
the blastdb, while the html primary file has something in it.

When I try to import the blastdb (from library to history) there is a 
message along the lines of can't import empty file. I hypothesise 
(admittedly without having looked at a line of code) that there is a 
test for file size 0 somewhere that is either altogether unnecessary or, 
more likely, does not take into account that for composite datatypes it 
might be completely legitimate for the primary file to be empty.
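
To make the hypothesis concrete, here is a minimal sketch of an
emptiness test that takes composite datatypes into account. It is not
Galaxy's code, which nobody in this thread has traced yet; the function
name is_importable and its arguments are hypothetical.

```python
import os


def is_importable(primary_path, extra_files_dir=None):
    """Return True if the dataset holds any data.

    A dataset counts as non-empty if its primary file has content or,
    for composite datatypes, if its extra files directory contains at
    least one non-empty file.
    """
    if os.path.isfile(primary_path) and os.path.getsize(primary_path) > 0:
        return True
    if extra_files_dir and os.path.isdir(extra_files_dir):
        for name in os.listdir(extra_files_dir):
            path = os.path.join(extra_files_dir, name)
            if os.path.isfile(path) and os.path.getsize(path) > 0:
                return True
    return False
```

A check on the primary file alone would reject a blastdb dataset, while
this version accepts it once the _files folder holds the database.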

Or is my primary blastdb file not supposed to be empty in the first 
place? I can blast against it just fine.

Thanks a lot for your help
Ulf

On 24/07/14 15:02, Peter Cock wrote:
 On Thu, Jul 24, 2014 at 2:50 PM, Nate Coraor n...@bx.psu.edu wrote:
 On Jul 23, 2014, at 6:42 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

 Interesting hypothesis - you may well be right.

 Galaxy guys - who is the expert to talk to on this and/or where
 in the code should we be looking?

 Thanks,

 Peter

 I think there's a bit of a mixup here - Peter, I believe you were asking
 if other composite types with an html primary dataset could be imported
 from the history to library, but Ulf, your test was the other direction
 (library-history). I'd be interested in knowing the outcome of the
 history-library test as well.

 Good catch - yes, that was what I was asking about. Ulf?

 I am woefully ignorant about the blastdbn datatype. Is the primary
 file supposed to be html type but empty?

 The BLAST databases are 'basic' composite datatypes, of which
 the most commonly used example is HTML (and some bits of
 the base class code seem to assume HTML). This means
 testing if something works with HTML is a good first step.

 https://github.com/peterjc/galaxy_blast/tree/master/datatypes/blast_datatypes

 Peter




Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-28 Thread Peter Cock
On Mon, Jul 28, 2014 at 8:28 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear Nate, dear Peter

 Sorry for the delay in replying.

 I can import both HTML and blastdb from a history to a data library. If
 I try to get the data out of the library into another history, I am
 successful for the html but not for the blastdb. The problem seems to be
 that the primary data file (the /path/dataset_12345.dat) is empty for
 the blastdb, while the html primary file has something in it.

OK. Can you tell where Galaxy thinks the library files are on disk,
and check to see if the folder of BLAST database files is actually
there?

 When I try to import the blastdb (from library to history) there is a
 message along the lines of can't import empty file. I hypothesise
 (admittedly without having looked at a line of code) that there is a
 test for file size 0 somewhere that is either altogether unnecessary or,
 more likely, does not take into account that for composite datatypes it
 might be completely legitimate for the primary file to be empty.

This guess makes sense - but I've not yet tried to trace through
the code either.

 Or is my primary blastdb file not supposed to be empty in the first
 place? I can blast against it just fine.

The BLAST databases do not define/populate a primary file, so
Galaxy seems to create a dummy empty file on its own. I have
wondered about altering the BLAST database datatype definition
to have a human readable text file as the primary file (i.e. the
information currently saved as a text log file when creating a
database).

 Thanks a lot for your help
 Ulf

You too - you've found an interesting bug...

Peter


Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-24 Thread Nate Coraor
On Jul 23, 2014, at 6:42 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

 Interesting hypothesis - you may well be right.
 
 Galaxy guys - who is the expert to talk to on this and/or where
 in the code should we be looking?
 
 Thanks,
 
 Peter

I think there's a bit of a mixup here - Peter, I believe you were asking if 
other composite types with an html primary dataset could be imported from the 
history to library, but Ulf, your test was the other direction 
(library-history). I'd be interested in knowing the outcome of the 
history-library test as well.

I am woefully ignorant about the blastdbn datatype. Is the primary file 
supposed to be html type but empty?

--nate

 
 On Wed, Jul 23, 2014 at 11:22 AM, Ulf Schaefer ulf.schae...@phe.gov.uk 
 wrote:
 Dear Peter
 
 Thanks for your reply.
 
 I can import an html report (e.g. FastQC output) successfully into a new
 history from a data library. But the .dat file for the html is not empty
 like the one for the blastdb. Makes me think that I could do this with a
 blast db as well, if only it would not check for size 0 at the time of
 importing it.
 
 Thanks
 Ulf


[galaxy-dev] Providing BLAST db in a data library

2014-07-23 Thread Ulf Schaefer
Dear all

I have several smallish BLAST databases that I would like to provide in 
a data library. I create them in a history with the makeblastdb tool and 
then try to add them to the library. I see that for each blast db there
is an empty file created (like /path/dataset_12345.dat) and a folder 
with the same name (/path/dataset_12345_files/) that contains the actual 
db files (blastdb.n*).

In my library the blastdb shows up empty and I cannot import it back to 
another history. It does not seem to be aware of the _files folder,
despite it being the right data type (blastdbn).

Any ideas what I am doing wrong?

Thanks a lot for your help
Ulf



Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-23 Thread Peter Cock
On Wed, Jul 23, 2014 at 10:47 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear all

 I have several smallish BLAST databases that I would like to provide in
 a data library. I create them in a history with the makeblastdb tool and
 then try to add them to the library. I see that for each blast db there
 is an empty file created (like /path/dataset_12345.dat) and a folder
 with the same name (/path/dataset_12345_files/) that contains the actual
 db files (blastdb.n*).

 In my library the blastdb shows up empty and I cannot import it back to
 another history. It does not seem to be aware of the _files folder,
 despite it being the right data type (blastdbn).

 Any ideas what I am doing wrong?

 Thanks a lot for your help
 Ulf

Hi Ulf,

I've never tried that. It could be a bug in Galaxy importing
composite datatypes into a library, or something in the BLAST
database definition which needs fixing. Does importing an
HTML report (with child files like images) into a library work
for you? (This is another composite datatype so a useful
comparison).

Rather than using Data Libraries, we just list all the locally
installed shared BLAST databases via the BLAST *.loc
files instead.

Note using the *.loc files makes the databases available to
all the Galaxy users, while with a Data Library you can
control access to specific groups/roles.
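
For anyone scripting this, a minimal sketch of building a *.loc entry
follows. It assumes the three tab-separated columns (unique id, caption,
base name path) used by the blastdb.loc sample file; please check the
sample file shipped with your wrapper version. The helper name loc_line
and the example values are hypothetical.

```python
def loc_line(unique_id, caption, base_path):
    """Build one tab-separated line for a BLAST *.loc file.

    Assumed column order: unique id, caption shown to users, and the
    database base name path (without the .n* or .p* extensions).
    """
    fields = (unique_id, caption, base_path)
    for field in fields:
        # Tabs or newlines inside a field would corrupt the table
        if "\t" in field or "\n" in field:
            raise ValueError("fields must not contain tabs or newlines")
    return "\t".join(fields)


# Example with made-up values:
# print(loc_line("my_db", "My small nucleotide db", "/data/blastdb/my_db"))
```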

Regards,

Peter


Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-23 Thread Ulf Schaefer
Dear Peter

Thanks for your reply.

I can import an html report (e.g. FastQC output) successfully into a new 
history from a data library. But the .dat file for the html is not empty 
like the one for the blastdb. Makes me think that I could do this with a 
blast db as well, if only it would not check for size 0 at the time of 
importing it.

Thanks
Ulf

On 23/07/14 10:56, Peter Cock wrote:
 On Wed, Jul 23, 2014 at 10:47 AM, Ulf Schaefer ulf.schae...@phe.gov.uk 
 wrote:
 Dear all

 I have several smallish BLAST databases that I would like to provide in
 a data library. I create them in a history with the makeblastdb tool and
 then try to add them to the library. I see that for each blast db there
 is an empty file created (like /path/dataset_12345.dat) and a folder
 with the same name (/path/dataset_12345_files/) that contains the actual
 db files (blastdb.n*).

 In my library the blastdb shows up empty and I cannot import it back to
 another history. It does not seem to be aware of the _files folder,
 despite it being the right data type (blastdbn).

 Any ideas what I am doing wrong?

 Thanks a lot for your help
 Ulf

 Hi Ulf,

 I've never tried that. It could be a bug in Galaxy importing
 composite datatypes into a library, or something in the BLAST
 database definition which needs fixing. Does importing an
 HTML report (with child files like images) into a library work
 for you? (This is another composite datatype so a useful
 comparison).

 Rather than using Data Libraries, we just list all the locally
 installed shared BLAST databases via the BLAST *.loc
 files instead.

 Note using the *.loc files makes the databases available to
 all the Galaxy users, while with a Data Library you can
 control access to specific groups/roles.

 Regards,

 Peter




Re: [galaxy-dev] Providing BLAST db in a data library

2014-07-23 Thread Peter Cock
Interesting hypothesis - you may well be right.

Galaxy guys - who is the expert to talk to on this and/or where
in the code should we be looking?

Thanks,

Peter

On Wed, Jul 23, 2014 at 11:22 AM, Ulf Schaefer ulf.schae...@phe.gov.uk wrote:
 Dear Peter

 Thanks for your reply.

 I can import an html report (e.g. FastQC output) successfully into a new
 history from a data library. But the .dat file for the html is not empty
 like the one for the blastdb. Makes me think that I could do this with a
 blast db as well, if only it would not check for size 0 at the time of
 importing it.

 Thanks
 Ulf