Re: [galaxy-dev] BED12 Data Format

2016-02-27 Thread Lance Parsons
I'm still running into issues with BED vs. BED12 files because Galaxy 
automatically converts GFF/GTF files to BED files that are not BED12. 
Essentially, the RSeQC tools require BED12 files, but Galaxy lets users 
select GTF/GFF files, which it then converts to plain BED rather than BED12.


Is there any desire/support on the part of the Galaxy team to allow 
tools to use/require the BED12 format?


If not, I'll work on incorporating the conversion(s) within each tool 
wrapper, unless someone has an alternate suggestion.


Thanks all,

Lance


Lance Parsons 
January 29, 2016 at 11:03 AM
Well, I somewhat follow you. My main goal would be to allow people to 
use GTF files in the RSeQC tools. My initial thought was to write a 
tool to convert GTF to BED12 (which I've done). However, it would be 
really nice to have Galaxy be able to automatically convert behind the 
scenes. The problem there is that it already does conversion from GTF 
to BED, but not to BED12. Another option would be to include the 
conversion as part of the tools, but that is kinda messy and doesn't 
help any other tools that need the same thing.


I'm open to suggestions on how to handle this, but right now, the only 
option I can see is to build the conversion in as part of the tool, 
correct? I'm not quite sure why allowing users to specify BED12 vs 
BED6 vs "BED" is any worse than the way things work with other 
datatypes (fastq, fastqsanger, etc.), but I realize you guys have a 
lot of experience with users that I don't.


Lance

Daniel Blankenberg wrote:

Daniel Blankenberg 
January 28, 2016 at 11:55 AM
Hi Lance,

Ah, yes, using bed12 would be problematic, as all bedstrict types are 
currently set to not be uploadable or datatype assignable. This stems 
from a long history of abuse of the BED format in Galaxy (where 
datasets should have been generic 'interval' matching bed metadata of 
chrom,start,end of 1,2,3). I added these datatypes to force 
conversions of bed/interval files to Real bed/6/12 files, especially 
as needed by external visualization tools --- that way you can click 
on a 'bed' (fake) file and Galaxy will convert it to Real bed and load 
the external service, or run the Galaxy tool on the properly formatted 
data.


Basically, there are restrictions in place to try to ensure that a 
bedstrict datatype is actually BED conforming. I am open to loosening 
these restrictions up, however, and allowing them to function as 
'normal' datatypes (users would be free to shoot themselves in the 
foot by mis-assigning the datatype). In the meantime, you should be 
able to test this by flipping 'allow_datatype_change' in interval.py 
to True.


Thoughts?


Thanks,

Dan





Lance Parsons 
January 27, 2016 at 3:38 PM
Thanks for the info Dan. I've explored using bed12, but I have a few 
questions.


1. When attempting to use 'bed12' or 'Bed12' as a file type in a test, 
I get the following error:


Exception: {u'message': {u'type': u'error', u'data': 
{u'file_type': u"An invalid option was selected for file_type, 
u'Bed12', please verify.", u'files_metadata': [u"An invalid option was 
selected for file_type, u'Bed12', please verify."]}}}


2. I see that Bed12 is in the datatypes_conf.sample file. Is there a 
way to add a converter for that datatype? Perhaps something like: 
https://wiki.galaxyproject.org/ToolShedDatatypesFeatures#Including_datatype_converters_and_display_applications? 
My concern is that since it already exists, I wouldn't be able to add 
a converter. Also, the sniffer doesn't seem to work (it just finds the 
files as "bed", thus my desire to specify ftype in tests).


Thanks,
Lance

Daniel Blankenberg wrote:


Hi Lance,

FWIW, there are existing bedstrict, bed12, and bed6 datatypes 
in Galaxy. The strict datatypes are currently usually created by 
implicit datatype converters and are most often used by some external 
display applications that need standards-conforming files. bed6/12 
are subclasses of bedstrict. They can of course be consumed or 
created by any sort of tool. Please let us know if we can provide 
additional information.



Thanks for using Galaxy,

Dan


On Jan 14, 2016, at 4:26 PM, Lance Parsons wrote:




Does anyone know of any efforts to create a BED12 datatype for 
Galaxy? Since some tools require BED12 and the automatic conversion 
from GFF-to-BED does not seem to generate a BED12, it seems it might 
be a worthwhile addition.


If not, what would be the best way to go about doing this? Making it 
part of the core galaxy (which would allow multiple tools to share 
the same data type definitions) or making it part of a toolshed tool 
(which I'm not sure how to do)? BTW, I'm thinking about RSeQC at the 
moment, but I know other tools use/require this format.


--
Lance Parsons - Scientific Programmer
Carl C. Icahn Laboratory - Room 141 (Temporary)
Lewis-Sigler Institute for Integrative Genomics
Princeton Universi

Re: [galaxy-dev] BED12 Data Format

2016-01-29 Thread Lance Parsons
Well, I somewhat follow you. My main goal would be to allow people to 
use GTF files in the RSeQC tools. My initial thought was to write a tool 
to convert GTF to BED12 (which I've done). However, it would be really 
nice to have Galaxy be able to automatically convert behind the scenes. 
The problem there is that it already does conversion from GTF to BED, 
but not to BED12. Another option would be to include the conversion as 
part of the tools, but that is kinda messy and doesn't help any other 
tools that need the same thing.
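
For illustration, the kind of GTF-to-BED12 conversion discussed above can be sketched in Python. This is a minimal sketch under stated assumptions, not the converter mentioned in this message and not Galaxy's built-in GTF-to-BED converter: the function name gtf_to_bed12 is hypothetical, only exon records are read, CDS records are ignored (so thickStart and thickEnd are simply set to the transcript start), and score and itemRgb are filled with 0.

```python
from collections import defaultdict


def gtf_to_bed12(gtf_lines):
    """Yield one BED12 line per transcript, built from GTF exon records.

    Hypothetical helper for illustration only.
    """
    # (chrom, strand, transcript_id) -> list of (start, end) in BED coordinates
    transcripts = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        chrom, strand, attrs = fields[0], fields[6], fields[8]
        start, end = int(fields[3]), int(fields[4])
        transcript_id = None
        for attr in attrs.split(";"):
            key, _, value = attr.strip().partition(" ")
            if key == "transcript_id":
                transcript_id = value.strip('"')
        if transcript_id is None:
            continue
        # GTF is 1-based and inclusive; BED is 0-based and half-open.
        transcripts[(chrom, strand, transcript_id)].append((start - 1, end))

    for (chrom, strand, transcript_id), exons in transcripts.items():
        exons.sort()
        tx_start = exons[0][0]
        tx_end = max(e for _, e in exons)
        block_sizes = ",".join(str(e - s) for s, e in exons)
        block_starts = ",".join(str(s - tx_start) for s, _ in exons)
        yield "\t".join([
            chrom, str(tx_start), str(tx_end), transcript_id,
            "0", strand,
            str(tx_start), str(tx_start),  # thickStart/thickEnd: no CDS used
            "0", str(len(exons)), block_sizes, block_starts,
        ])
```

Each transcript becomes a single 12-column line whose last three columns (blockCount, blockSizes, blockStarts) carry the exon structure, which is the information a 3- or 6-column BED conversion drops.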


I'm open to suggestions on how to handle this, but right now, the only 
option I can see is to build the conversion in as part of the tool, 
correct? I'm not quite sure why allowing users to specify BED12 vs BED6 
vs "BED" is any worse than the way things work with other datatypes 
(fastq, fastqsanger, etc.), but I realize you guys have a lot of 
experience with users that I don't.


Lance

Daniel Blankenberg wrote:

Hi Lance,

Ah, yes, using bed12 would be problematic, as all bedstrict types are 
currently set to not be uploadable or datatype assignable. This stems 
from a long history of abuse of the BED format in Galaxy (where 
datasets should have been generic 'interval' matching bed metadata of 
chrom,start,end of 1,2,3). I added these datatypes to force 
conversions of bed/interval files to Real bed/6/12 files, especially 
as needed by external visualization tools --- that way you can click 
on a 'bed' (fake) file and Galaxy will convert it to Real bed and load 
the external service, or run the Galaxy tool on the properly formatted 
data.


Basically, there are restrictions in place to try to ensure that a 
bedstrict datatype is actually BED conforming. I am open to loosening 
these restrictions up, however, and allowing them to function as 
'normal' datatypes (users would be free to shoot themselves in the 
foot by mis-assigning the datatype). In the meantime, you should be 
able to test this by flipping 'allow_datatype_change' in interval.py 
to True.


Thoughts?


Thanks,

Dan



On Jan 27, 2016, at 3:38 PM, Lance Parsons wrote:


Thanks for the info Dan. I've explored using bed12, but I have a few 
questions.


1. When attempting to use 'bed12' or 'Bed12' as a file type in a 
test, I get the following error:


Exception: {u'message': {u'type': u'error', u'data': 
{u'file_type': u"An invalid option was selected for file_type, 
u'Bed12', please verify.", u'files_metadata': [u"An invalid option 
was selected for file_type, u'Bed12', please verify."]}}}


2. I see that Bed12 is in the datatypes_conf.sample file. Is there a 
way to add a converter for that datatype? Perhaps something 
like: https://wiki.galaxyproject.org/ToolShedDatatypesFeatures#Including_datatype_converters_and_display_applications? 
My concern is that since it already exists, I wouldn't be able to add 
a converter. Also, the sniffer doesn't seem to work (it just finds 
the files as "bed", thus my desire to specify ftype in tests).


Thanks,
Lance

Daniel Blankenberg wrote:


Hi Lance,

FWIW, there are existing bedstrict, bed12, and bed6 datatypes 
in Galaxy. The strict datatypes are currently usually created by 
implicit datatype converters and are most often used by some 
external display applications that need standards-conforming files. 
bed6/12 are subclasses of bedstrict. They can of course be consumed 
or created by any sort of tool. Please let us know if we can provide 
additional information.



Thanks for using Galaxy,

Dan


On Jan 14, 2016, at 4:26 PM, Lance Parsons wrote:




Does anyone know of any efforts to create a BED12 datatype for 
Galaxy? Since some tools require BED12 and the automatic conversion 
from GFF-to-BED does not seem to generate a BED12, it seems it 
might be a worthwhile addition.


If not, what would be the best way to go about doing this? Making 
it part of the core galaxy (which would allow multiple tools to 
share the same data type definitions) or making it part of a 
toolshed tool (which I'm not sure how to do)? BTW, I'm thinking 
about RSeQC at the moment, but I know other tools use/require this 
format.


--
Lance Parsons - Scientific Programmer
Carl C. Icahn Laboratory - Room 141 (Temporary)
Lewis-Sigler Institute for Integrative Genomics
Princeton University

___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/







Re: [galaxy-dev] BED12 Data Format

2016-01-28 Thread Daniel Blankenberg
Hi Lance,

Ah, yes, using bed12 would be problematic, as all bedstrict types are currently 
set to not be uploadable or datatype assignable. This stems from a long history 
of abuse of the BED format in Galaxy (where datasets should have been generic 
‘interval’ matching bed metadata of chrom,start,end of 1,2,3). I added these 
datatypes to force conversions of bed/interval files to Real bed/6/12 files, 
especially as needed by external visualization tools — that way you can click 
on a ‘bed’ (fake) file and Galaxy will convert it to Real bed and load the 
external service, or run the Galaxy tool on the properly formatted data.

Basically, there are restrictions in place to try to ensure that a bedstrict 
datatype is actually BED conforming. I am open to loosening these restrictions 
up, however, and allowing them to function as ‘normal’ datatypes (users would 
be free to shoot themselves in the foot by mis-assigning the datatype). In the 
meantime, you should be able to test this by flipping ‘allow_datatype_change’ 
in interval.py to True.
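
The restriction described above can be pictured with a small stand-alone Python sketch. It is not Galaxy's source code: the classes below are stand-ins, and the only details taken from this thread are the attribute name allow_datatype_change, the bedstrict/bed6/bed12 names, and the fact that bed6 and bed12 subclass bedstrict. The Interval and Bed base classes and the helper assignable_types are assumptions made for illustration.

```python
class Interval:
    """Stand-in for a generic interval datatype (assumed base class)."""
    file_ext = "interval"
    allow_datatype_change = True


class Bed(Interval):
    """Stand-in for the permissive 'bed' datatype."""
    file_ext = "bed"


class BedStrict(Bed):
    """Stand-in for the strict BED datatype; users may not assign it."""
    file_ext = "bedstrict"
    allow_datatype_change = False


class Bed6(BedStrict):
    file_ext = "bed6"


class Bed12(BedStrict):
    file_ext = "bed12"


def assignable_types(datatypes):
    """Return the extensions a user would be allowed to pick (hypothetical helper)."""
    return [d.file_ext for d in datatypes if d.allow_datatype_change]


ALL_TYPES = [Interval, Bed, BedStrict, Bed6, Bed12]
print(assignable_types(ALL_TYPES))
```

Because Bed6 and Bed12 inherit the flag from BedStrict, changing it in one place affects all three strict types, which is consistent with bed12 being rejected as a file_type while the flag is False.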

Thoughts?


Thanks,

Dan



On Jan 27, 2016, at 3:38 PM, Lance Parsons  wrote:

> Thanks for the info Dan. I've explored using bed12, but I have a few 
> questions.
> 
> 1. When attempting to use 'bed12' or 'Bed12' as a file type in a test, I get 
> the following error: 
> 
> Exception: {u'message': {u'type': u'error', u'data': {u'file_type': u"An 
> invalid option was selected for file_type, u'Bed12', please verify.", 
> u'files_metadata': [u"An invalid option was selected for file_type, u'Bed12', 
> please verify."]}}}
> 
> 2. I see that Bed12 is in the datatypes_conf.sample file. Is there a way to 
> add a converter for that datatype? Perhaps something 
> like: https://wiki.galaxyproject.org/ToolShedDatatypesFeatures#Including_datatype_converters_and_display_applications?
>  My concern is that since it already exists, I wouldn't be able to add a 
> converter. Also, the sniffer doesn't seem to work (it just finds the files as 
> "bed", thus my desire to specify ftype in tests).
> 
> Thanks,
> Lance
> 
> Daniel Blankenberg wrote:
>> 
>> Hi Lance,
>> 
>> FWIW, there are existing bedstrict, bed12, and bed6 datatypes in 
>> Galaxy. The strict datatypes are currently usually created by implicit 
>> datatype converters and are most often used by some external display 
>> applications that need standards-conforming files. bed6/12 are subclasses 
>> of bedstrict. They can of course be consumed or created by any sort of tool. 
>> Please let us know if we can provide additional information.
>> 
>> 
>> Thanks for using Galaxy,
>> 
>> Dan
>> 
>> 
>> On Jan 14, 2016, at 4:26 PM, Lance Parsons  wrote:
>> 
>>> 
>>> Does anyone know of any efforts to create a BED12 datatype for Galaxy? 
>>> Since some tools require BED12 and the automatic conversion from GFF-to-BED 
>>> does not seem to generate a BED12, it seems it might be a worthwhile 
>>> addition.
>>> 
>>> If not, what would be the best way to go about doing this? Making it part 
>>> of the core galaxy (which would allow multiple tools to share the same data 
>>> type definitions) or making it part of a toolshed tool (which I'm not sure 
>>> how to do)? BTW, I'm thinking about RSeQC at the moment, but I know other 
>>> tools use/require this format.
>>> 
>>> -- 
>>> Lance Parsons - Scientific Programmer
>>> Carl C. Icahn Laboratory - Room 141 (Temporary)
>>> Lewis-Sigler Institute for Integrative Genomics
>>> Princeton University
>>> 

Re: [galaxy-dev] BED12 Data Format

2016-01-27 Thread Lance Parsons
Thanks for the info Dan. I've explored using bed12, but I have a few 
questions.


1. When attempting to use 'bed12' or 'Bed12' as a file type in a test, I 
get the following error:


Exception: {u'message': {u'type': u'error', u'data': {u'file_type': 
u"An invalid option was selected for file_type, u'Bed12', please 
verify.", u'files_metadata': [u"An invalid option was selected for 
file_type, u'Bed12', please verify."]}}}


2. I see that Bed12 is in the datatypes_conf.sample file. Is there a way 
to add a converter for that datatype? Perhaps something like: 
https://wiki.galaxyproject.org/ToolShedDatatypesFeatures#Including_datatype_converters_and_display_applications? 
My concern is that since it already exists, I wouldn't be able to add a 
converter. Also, the sniffer doesn't seem to work (it just finds the 
files as "bed", thus my desire to specify ftype in tests).
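
To make the distinction between a generic "bed" file and a BED12 file concrete, here is a minimal Python sketch of a per-line check. It is not Galaxy's sniffer; the function name looks_like_bed12 is hypothetical, and the rules applied are the usual BED12 constraints (12 tab-separated columns, blockCount matching the blockSizes and blockStarts lists, first block at offset 0, last block ending at chromEnd).

```python
def looks_like_bed12(line):
    """Return True if a single tab-separated line satisfies basic BED12 rules.

    Hypothetical helper for illustration only.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        return False
    try:
        chrom_start, chrom_end = int(fields[1]), int(fields[2])
        block_count = int(fields[9])
        # Trailing commas are common in blockSizes/blockStarts; tolerate them.
        block_sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
        block_starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    except ValueError:
        return False
    if fields[5] not in ("+", "-", "."):
        return False
    if not 0 <= chrom_start <= chrom_end:
        return False
    if len(block_sizes) != block_count or len(block_starts) != block_count:
        return False
    # The first block must start at chromStart and the last must end at chromEnd.
    return (
        block_starts[0] == 0
        and chrom_start + block_starts[-1] + block_sizes[-1] == chrom_end
    )
```

A 6-column line fails the first test immediately, so a check along these lines separates BED12 from shorter BED variants that would otherwise all be reported as "bed".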


Thanks,
Lance

Daniel Blankenberg wrote:


Hi Lance,

FWIW, there are existing bedstrict, bed12, and bed6 datatypes in 
Galaxy. The strict datatypes are currently usually created by implicit 
datatype converters and are most often used by some external display 
applications that need standards-conforming files. bed6/12 are 
subclasses of bedstrict. They can of course be consumed or created by 
any sort of tool. Please let us know if we can provide additional 
information.



Thanks for using Galaxy,

Dan


On Jan 14, 2016, at 4:26 PM, Lance Parsons  wrote:



Does anyone know of any efforts to create a BED12 datatype for 
Galaxy? Since some tools require BED12 and the automatic conversion 
from GFF-to-BED does not seem to generate a BED12, it seems it might 
be a worthwhile addition.


If not, what would be the best way to go about doing this? Making it 
part of the core galaxy (which would allow multiple tools to share 
the same data type definitions) or making it part of a toolshed tool 
(which I'm not sure how to do)? BTW, I'm thinking about RSeQC at the 
moment, but I know other tools use/require this format.


--
Lance Parsons - Scientific Programmer
Carl C. Icahn Laboratory - Room 141 (Temporary)
Lewis-Sigler Institute for Integrative Genomics
Princeton University


Re: [galaxy-dev] BED12 Data Format

2016-01-25 Thread Daniel Blankenberg
Hi Lance,

FWIW, there are existing bedstrict, bed12, and bed6 datatypes in Galaxy. 
The strict datatypes are currently usually created by implicit datatype 
converters and are most often used by some external display applications that 
need standards-conforming files. bed6/12 are subclasses of bedstrict. They can 
of course be consumed or created by any sort of tool. Please let us know if we 
can provide additional information.


Thanks for using Galaxy,

Dan


On Jan 14, 2016, at 4:26 PM, Lance Parsons  wrote:

> Does anyone know of any efforts to create a BED12 datatype for Galaxy? Since 
> some tools require BED12 and the automatic conversion from GFF-to-BED does 
> not seem to generate a BED12, it seems it might be a worthwhile addition.
> 
> If not, what would be the best way to go about doing this? Making it part of 
> the core galaxy (which would allow multiple tools to share the same data type 
> definitions) or making it part of a toolshed tool (which I'm not sure how to 
> do)? BTW, I'm thinking about RSeQC at the moment, but I know other tools 
> use/require this format.
> 
> -- 
> Lance Parsons - Scientific Programmer
> Carl C. Icahn Laboratory - Room 141 (Temporary)
> Lewis-Sigler Institute for Integrative Genomics
> Princeton University
> 

[galaxy-dev] BED12 Data Format

2016-01-14 Thread Lance Parsons
Does anyone know of any efforts to create a BED12 datatype for Galaxy? 
Since some tools require BED12 and the automatic conversion from 
GFF-to-BED does not seem to generate a BED12, it seems it might be a 
worthwhile addition.


If not, what would be the best way to go about doing this? Making it 
part of the core galaxy (which would allow multiple tools to share the 
same data type definitions) or making it part of a toolshed tool (which 
I'm not sure how to do)? BTW, I'm thinking about RSeQC at the moment, 
but I know other tools use/require this format.


--
Lance Parsons - Scientific Programmer
Carl C. Icahn Laboratory - Room 141 (Temporary)
Lewis-Sigler Institute for Integrative Genomics
Princeton University
