Re: [galaxy-dev] bowtie dataset pair input

2015-06-25 Thread Bjoern Gruening

Hi Ryan,

latest wrappers are here: 
https://github.com/galaxyproject/tools-devteam/tree/master/tools/bowtie2

And a PR would be great!

But as far as I can see this is already implemented: you can choose 
`Paired-end Dataset Collection` as an option, can't you?


Ciao,
Bjoern

On 24.06.2015 20:02, Ryan G wrote:
Hi all - It looks like bowtie's wrapper is working incorrectly for a 
list of dataset pairs.  It's expecting all the forward reads in one 
dataset list and the reverse reads in a separate dataset list.


Instead, I have a list of dataset pairs (for paired-end data). This 
cannot be provided to bowtie as input.  If I correct this, should I 
create a pull request for it?  Alternatively, does someone already 
have a corrected version of this?




___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
   https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
   http://galaxyproject.org/search/mailinglists/



[galaxy-dev] problem with view of list collection in history view

2015-06-25 Thread Alexander Vowinkel
Hi Team,

My tool dynamically creates 96 datasets bundled into a list.
In the history I can see the number 96 in the top as hidden datasets
(6 shown, 96 hidden)

When I open the list, I just can see 64 items.

Now I run the job again and I have 96 more hidden items.
I open the new list and can see 66 items in that new list.

What is going on here?
Is that just a visual bug?
Or are my datasets affected?

Thanks,
Alexander

PS: I use postgres

Re: [galaxy-dev] list collection output - format set from input

2015-06-25 Thread Alexander Vowinkel
no - not a typo. The tool can process both.
It's just my naming, because I use it for fastq.
I'll change the help tag.

2015-06-25 3:44 GMT-05:00 Peter Cock p.j.a.c...@googlemail.com:

 Hi Alexander,

 If this wasn't a collection, I would expect  format_source to work
 (possibly also using metadata_source=fastq_input1), so perhaps
 this is a bug - John?

 Peter

 P.S. Your help caption and output label both say FASTQ, but the
 input also allows FASTA input. Typo?

 On Thu, Jun 25, 2015 at 2:39 AM, Alexander Vowinkel
 vowinkel.alexan...@gmail.com wrote:
  Hi,
 
  I have an input, that can be fasta,fastqsanger,fastqillumina:
 
  <param name="fastq_input1" type="data"
  format="fasta,fastqsanger,fastqillumina" label="Select the fastq file"
  help="Specify fastq file with reads"/>
 
 
  I have multiple outputfiles - bundled in a list collection:
 
  <collection name="split_output" type="list"
  label="@OUTPUT_NAME_PREFIX@ on ${on_string} (Fastq Collection)"
  format_source="fastq_input1">
  <discover_datasets pattern="__name_and_ext__" directory="splits" />
  </collection>
 
 
  The format_source parameter doesn't work - the files in the list
  (extension fq) are of format fq
 
  How can I make it possible that they are fasta,fastqsanger,fastqillumina
  depending on fastq_input1?
 
  Thanks,
  Alexander
 

Re: [galaxy-dev] list collection output - format set from input

2015-06-25 Thread Alexander Vowinkel
Hi John,

yes - I created this hacky solution and it works.

I now tried what you said, but no success.
Code:


 <collection name="split_output" type="list"
 label="@OUTPUT_NAME_PREFIX@ on ${on_string} (Fastq Collection)"
 format_source="fastq_input1">
 <discover_datasets pattern="sample_(?P&lt;name&gt;.+)\.fq"
 directory="output" />
 </collection>


It puts the samples into the collection correctly, but doesn't set a data
type.

Oddly enough, the files in the collection don't even show a format
(in the small history view on the right). Is that normal?

Follow-up: how can I get the files in the collection to have the same
database build attached? The other file (log file) gets it.

Best,
Alexander


2015-06-25 9:12 GMT-05:00 John Chilton jmchil...@gmail.com:

 You are giving Galaxy mixed signals :). format_source will say to use
 the data type specified by the corresponding input - but

 <discover_datasets pattern="__name_and_ext__" directory="splits" />

 Is saying (with the pattern __name_and_ext__) read files of the form
 out1.fastq and assign the collection identifier to out1 and the
 extension/format to fastq.

 Two things should work - one of them will and the other might but should.

 One thing you can do is not override the extension in your pattern:

 <discover_datasets pattern="__name__" directory="splits" />

 or if you don't want to include the .fq in the output identifier (and
 you probably don't want to)

 <discover_datasets pattern="(?P&lt;name&gt;.+)\.fq" />

 If that doesn't work - it is a bug and please let me know and I will
 attempt to fix it.

 The hackier way to get this to work that I am more confident will work
 - is to drop the format_source - use the same pattern:

 <discover_datasets pattern="__name_and_ext__" directory="splits" />

 But just use a shell command or something to rename all files of the
 form *.fq to *.${fastq_input1.ext}.

 Hope this helps.

 -John



Re: [galaxy-dev] list collection output - format set from input

2015-06-25 Thread John Chilton
You are giving Galaxy mixed signals :). format_source will say to use
the data type specified by the corresponding input - but

<discover_datasets pattern="__name_and_ext__" directory="splits" />

Is saying (with the pattern __name_and_ext__) read files of the form
out1.fastq and assign the collection identifier to out1 and the
extension/format to fastq.

Two things should work - one of them will and the other might but should.

One thing you can do is not override the extension in your pattern:

<discover_datasets pattern="__name__" directory="splits" />

or if you don't want to include the .fq in the output identifier (and
you probably don't want to)

<discover_datasets pattern="(?P&lt;name&gt;.+)\.fq" />

If that doesn't work - it is a bug and please let me know and I will
attempt to fix it.

The hackier way to get this to work that I am more confident will work
- is to drop the format_source - use the same pattern:

<discover_datasets pattern="__name_and_ext__" directory="splits" />

But just use a shell command or something to rename all files of the
form *.fq to *.${fastq_input1.ext}.

Hope this helps.

-John
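
The rename workaround and the named-group pattern described above can be sketched in Python. This is only an illustration under stated assumptions: `rename_splits` and `identifier_for` are hypothetical helper names, the target extension would come from the input's datatype (`${fastq_input1.ext}` in the tool), and the sketch assumes the pattern has to match the whole file name, which may not be exactly how Galaxy applies it.

```python
import re
from pathlib import Path


def rename_splits(directory, new_ext, old_ext="fq"):
    """Rename every *.old_ext file in `directory` to *.new_ext.

    Mirrors the shell-rename workaround: after renaming, a
    __name_and_ext__ discovery pattern would see the desired extension.
    """
    renamed = []
    # sorted() materialises the listing before any file is renamed
    for path in sorted(Path(directory).glob(f"*.{old_ext}")):
        target = path.with_suffix(f".{new_ext}")
        path.rename(target)
        renamed.append(target.name)
    return renamed


# The named-group pattern from the thread as a plain Python regex
# (in the tool XML the angle brackets are escaped as &lt; and &gt;).
NAME_ONLY = re.compile(r"(?P<name>.+)\.fq")


def identifier_for(filename):
    """Return the identifier the pattern would capture, or None.

    Assumption: the pattern must match the entire file name.
    """
    match = NAME_ONLY.fullmatch(filename)
    return match.group("name") if match else None
```

With the first approach the extension never reaches Galaxy, so format_source can decide the datatype; with the rename approach the extension itself carries the datatype.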



Re: [galaxy-dev] Using API to identify all datasets that were part of a workflow?

2015-06-25 Thread Ben Bimber
The latter: starting with a dataset, pull its full history. If it was
created by running a simple single-step tool, that is one step. If it
was created as part of a workflow, grab that whole series of
steps/inputs/outputs.

I agree on the Python/Java bindings being out of date, but even when I was
scanning the JSON I wasn't able to see where I'd glean this information.
The missing piece for me was always determining whether a given dataset was
connected to a larger workflow.

-ben
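
The "walk outwards" approach described in this thread can be sketched as a breadth-first traversal over jobs and datasets. This is a minimal sketch, not an API client: the `jobs` mapping is a hypothetical, simplified shape (job id to lists of input and output dataset ids) that would have to be assembled from the API responses first.

```python
from collections import deque


def connected_jobs(start_dataset, jobs):
    """Collect every job and dataset reachable from `start_dataset`.

    `jobs` maps a job id to {"inputs": [...], "outputs": [...]} lists of
    dataset ids (an assumed, simplified shape, not the API's JSON).
    """
    seen_jobs, seen_datasets = set(), {start_dataset}
    queue = deque([start_dataset])
    while queue:
        dataset = queue.popleft()
        for job_id, io in jobs.items():
            if job_id in seen_jobs:
                continue
            touched = io["inputs"] + io["outputs"]
            if dataset in touched:
                # This job consumed or produced the dataset: follow it.
                seen_jobs.add(job_id)
                for other in touched:
                    if other not in seen_datasets:
                        seen_datasets.add(other)
                        queue.append(other)
    return seen_jobs, seen_datasets
```

Note the limitation raised in the thread: this finds everything connected through shared datasets, which is not necessarily the same as "everything run by one workflow invocation".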

On Thu, Jun 25, 2015 at 7:26 AM, John Chilton jmchil...@gmail.com wrote:

 Can you clarify one thing for me - are you attempting to break a
 workflow invocation into steps, and then jobs, and then inputs and
 outputs (so working from the workflow invocation) or are you trying to
 scan existing histories and find a workflow for each dataset (so
 working from the history id and workflow id maybe)?

 I feel like this should be doable now - though blend4j and to a lesser
 extent even bioblend are pretty far behind what I would consider best
 practices for invoking workflows via the API so they may need to be
 updated.

 -John



Re: [galaxy-dev] Using API to identify all datasets that were part of a workflow?

2015-06-25 Thread John Chilton
Can you clarify one thing for me - are you attempting to break a
workflow invocation into steps, and then jobs, and then inputs and
outputs (so working from the workflow invocation) or are you trying to
scan existing histories and find a workflow for each dataset (so
working from the history id and workflow id maybe)?

I feel like this should be doable now - though blend4j and to a lesser
extent even bioblend are pretty far behind what I would consider best
practices for invoking workflows via the API so they may need to be
updated.

-John


On Thu, Jun 25, 2015 at 10:04 AM, Ben Bimber bbim...@gmail.com wrote:
 Hello,

 I'm still relatively new to galaxy.  I'm trying to use the API to identify
 the string of jobs/datasets that were created as part of executing a
 workflow.  So far as I can tell, the API gives me the ID of the job, which
 corresponds to one step in the workflow.  Each of these has inputs/outputs.
 I can walk outwards and try to connect any other jobs that happen to use one
 of these files as an input or output; however, I am not seeing any key that
 provides a more direct indication that a set of steps was executed as part
 of a given workflow.  Am I missing something?

 Thanks in advance,
 Ben


Re: [galaxy-dev] Data Collections

2015-06-25 Thread John Chilton
For the list's sake - I think we figured this out on IRC and it had to
do with having two versions of Galaxy installed on the same machine.
Alexander - let me know if this issue is not resolved.

-John

On Mon, Jun 15, 2015 at 4:02 PM, Alexander Vowinkel
vowinkel.alexan...@gmail.com wrote:
 Thank you for these detailed descriptions!

 I already have a followup question.
 I'm working on Galaxy Cloudman:

 Galaxy is at revision 93cda3eb81 (master branch, from 11 Jun 2015).

 But I can only find "Build dataset pair|list", not "List of Dataset Pairs"
 as in the video. In which version is that implemented?

 Best,
 Alexander

 2015-06-15 10:17 GMT-05:00 John Chilton jmchil...@gmail.com:

 On Wed, Jun 10, 2015 at 4:04 PM, Alexander Vowinkel
 vowinkel.alexan...@gmail.com wrote:
  Hi Folks,
 
  thank you so far for the previous help. I got much further.
  Now I'm stuck with data collections.
 
  Because this is quite a list, I appreciate also answers to parts of my
  questions ;)
 
  I have two issues:
  A) manual definition of data collections (any type) by user and/or admin
  B) definition of data collections as input/output of a tool and inside a
  workflow
 
 
  A) manual
  Basically I would like to create
  i) a list of fastq files (unpaired)
  ii) a paired set of two fastq files
  iii) a list of pairs of fastq files
 
  How can I do that?
  By using the web app? As user? As admin?
  By working via ssh on the server?

 So each of these got much easier/more robust with the most recent release.

 From the user perspective - for any of these options you will want to
 load the fastq files into a history, open the "manage multiple datasets" option
 (https://wiki.galaxyproject.org/Histories#Managing_Multiple_Datasets_Easily),
 select the datasets, and then choose the list type from the menu. Each
 will cause a widget to pop up allowing you to group the datasets (into
 a list, a pair, or a list of pairs depending on your selection).

 The most complicated option is the list of pairs - this option is
 demonstrated in the first video in Anton's recent NGS 101 -
 Reference-based RNA-seq series
 (https://vimeo.com/channels/884356/128265983). More information at
 https://wiki.galaxyproject.org/Learn/GalaxyNGS101.

 For all user-centric scenarios - you will need to get the plain
 datasets into a history first. FTP upload for instance doesn't support
 creating collections directly - you can import datasets and then
 create them. Likewise - data libraries do not currently support
 dataset collections. I believe there are Trello cards for both of
 these issues.

 For admins - there is a dataset collection API - I can point you at
 examples if you want - but this doesn't seem to be your interest.
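
As a rough illustration of the API route mentioned above, the nested structure of a list of dataset pairs can be described as a plain dictionary before it is sent anywhere. Everything here is an assumption made for illustration: the helper name `list_of_pairs_payload` is hypothetical, and the field names (`element_identifiers`, `src`, `collection_type`) should be checked against the API documentation of the Galaxy release in use.

```python
def list_of_pairs_payload(name, pairs):
    """Build a request body describing a list:paired collection.

    `pairs` maps a sample name to a (forward_id, reverse_id) tuple of
    existing history dataset ids. Field names are assumptions and are
    not verified against any particular Galaxy release.
    """
    elements = []
    for sample, (forward_id, reverse_id) in pairs.items():
        elements.append({
            "name": sample,
            "src": "new_collection",
            "collection_type": "paired",
            "element_identifiers": [
                {"name": "forward", "src": "hda", "id": forward_id},
                {"name": "reverse", "src": "hda", "id": reverse_id},
            ],
        })
    return {
        "type": "dataset_collection",
        "collection_type": "list:paired",
        "name": name,
        "element_identifiers": elements,
    }
```

The sketch only builds the data structure; sending it to a server and handling the response are deliberately left out.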

 
 
  B) in tool/workflow
  Here I also have different approaches I would like to realize:
  i) use a collection as input for a tool
  ii) create a collection as output of a tool
  ii.1) from known # of output parameters
  ii.2) from unknown # of output parameters
 
  For these things I was trying to find some tools in the toolshed to see how
  they do it, but I couldn't quite adapt it.

 I would look in the following directory instead of the tool shed -
 https://github.com/galaxyproject/galaxy/tree/dev/test/functional/tools.
 These are the tools used to drive the testing of the collections
 implementation and contain some very stripped down examples of what is
 possible.

 
  i) use a collection as input for a tool
  This is well documented - realizable by type=data_collection and the
  collection_type.
  Unfortunately I can't test this because I can't create a collection so far
  ;) - see A

 Indeed :). Some good examples here are the tools in the RNA-seq
 pipeline - Tophat, Bowtie2, etc.

 
  ii) create a collection as output of a tool
  Here it gets blurry for me.

 So one can get very far without ever creating an output from a tool
 explicitly. I contend most of the time - if you have a list of bam
 files and you want to create another list of bam files - you just want
 to map some operation over them. This is demonstrated in that RNA-seq
 outline - and talked about in a more theoretical way in my GCC talk
 from last year http://bit.ly/gcc2014workflows.

 There are definitely cases when you want to explicitly create
 collections though - the current best documentation on this is going
 to be the pull request that added them - not the implementation but
 the description which actually lays out these same categories and how
 to handle them with explicit complete examples.

 https://bitbucket.org/galaxy/galaxy-central/pull-request/634/allow-tools-to-explicitly-produce-dataset

 Hopefully this helps - please follow up with additional questions as
 you have them. I am keen to see more developers leveraging dataset
 collections.

 Thanks a bunch.
 -John

 
  ii.1) from known # of output parameters
  Here I didn't find a tool. I just thought it might be a simpler case than
  ii.2 and good for understanding the concept.
 

[galaxy-dev] Using API to identify all datasets that were part of a workflow?

2015-06-25 Thread Ben Bimber
Hello,

I'm still relatively new to galaxy.  I'm trying to use the API to identify
the string of jobs/datasets that were created as part of executing a
workflow.  So far as I can tell, the API gives me the ID of the job, which
corresponds to one step in the workflow.  Each of these has
inputs/outputs.  I can walk outwards and try to connect any other jobs that
happen to use one of these files as an input or output; however, I am not
seeing any key that provides a more direct indication that a set of steps
was executed as part of a given workflow.  Am I missing something?

Thanks in advance,
Ben

Re: [galaxy-dev] bowtie dataset pair input

2015-06-25 Thread John Chilton
What Bjoern said - unless you meant the older bowtie1 wrappers. Those
have not been updated - I think the decision was made at Penn State to
focus on bowtie2 - but if people are still interested in enhancing the
bowtie1 wrappers I think a PR would be welcome. There have been some
other relatively recent bowtie1 PRs (e.g. from Nicola
https://github.com/galaxyproject/tools-devteam/pull/49).

-John

On Thu, Jun 25, 2015 at 3:46 AM, Bjoern Gruening
bjoern.gruen...@gmail.com wrote:
 Hi Ryan,

 latest wrappers are here:
 https://github.com/galaxyproject/tools-devteam/tree/master/tools/bowtie2
 And a PR would be great!

 But as far as I can see this is already implemented: you can choose
 `Paired-end Dataset Collection` as an option, can't you?

 Ciao,
 Bjoern


Re: [galaxy-dev] samtools dependency changes ?

2015-06-25 Thread Bjoern Gruening

Hi Wolfgang,

I can only tell you that we also have problems with handling BAM files 
properly in Galaxy.
Our issue is more related to unsorted BAM files, but as far as I understand 
this is because the metadata creation changed from using samtools to 
using pysam. Maybe this helps you in finding a workaround.


Ciao,
Bjoern


hmm, no replies yet, so is anybody able to reproduce this behavior and 
would you not consider it a bug?


Best,
Wolfgang


On 06/16/2015 03:11 PM, Wolfgang Maier wrote:

Dear all,

with older Galaxies (prior to latest_15.03 I think), you could satisfy
Galaxy's samtools dependence for indexing bam files by having a samtools
executable in tool-dependencies/samtools/default/bin (with the
tool-dependencies directory declared as tool_dependency_dir in
galaxy.ini of course).

Now (checked with latest_15.03 and .05), this is not working any more!
The executable will still be used during bam uploads, but not when a bam
file gets created by a tool.

The reason is that before the job runner (tested this with the local job
runner only) used to build the dependency shell command for dependency
'samtools' before finishing a job, but now the job wrapper finish method
fails because it naively expects to find samtools on $PATH.

Best,
Wolfgang
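
The two lookup behaviours contrasted in this report can be sketched as follows. This is not Galaxy's actual dependency resolution code, only an illustration: `find_samtools` is a hypothetical helper that first checks the `samtools/default/bin` layout under the configured tool_dependency_dir and then falls back to $PATH.

```python
import os
import shutil


def find_samtools(tool_dependency_dir):
    """Return a samtools path, preferring the dependency directory.

    Looks for <tool_dependency_dir>/samtools/default/bin/samtools first
    and falls back to whatever is on $PATH; returns None if neither
    exists. Illustrative only, not Galaxy's resolver.
    """
    candidate = os.path.join(
        tool_dependency_dir, "samtools", "default", "bin", "samtools"
    )
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return shutil.which("samtools")
```

A check like this can help confirm which of the two locations actually provides the executable on a given server.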


Re: [galaxy-dev] input dataset(s) and collections

2015-06-25 Thread Ryan G
If the user selects multiple pairs of paired-end data, I want that
submitted the same way as a list of paired-end data.  I don't want a
separate job for each paired-end data.  Rather, I want a single job to
consume the entire list.

It seems to be easier to disable the ability to select multiple fastq1s and
fastq2s, and only allow the user to specify a list of paired-end reads.
This way, I can guarantee the pairs of fastq files match up correctly.

On Thu, Jun 25, 2015 at 10:41 AM, John Chilton jmchil...@gmail.com wrote:

 So you don't want the multi-run options to appear next to these
 inputs? There is currently no way for the tool author to disable this.
 I will admit that I am skeptical this is a good idea; fundamentally it
 feels like tool authors should not be able to prevent end users from
 running the tool multiple times in parallel. Though I will also admit
 in practice there are times this option leads to confusion for users
 about what will actually happen and multiple people have requested an
 option to disable it.


 http://dev.list.galaxyproject.org/Tool-development-Selecting-a-single-item-from-input-dataset-td4666447.html
 https://trello.com/c/qCtBBB8n

 -John


 On Tue, Jun 23, 2015 at 12:58 PM, Ryan G ngsbioinformat...@gmail.com
 wrote:
  Hi all - I'm constructing a wrapper for a tool I have and the input to the
  tool can be:
 
  1)  a single fastq file (single end sample)
  2)  multiple single-end fastq files
  3)  a single paired-end sample
  4)  multiple paired-end samples.
 
  I have cases #1 and #2 handled, however case #3 is presenting a problem.  If
  the user selects Paired-End sample, I want to restrict them to selecting
  only a single fastq1 and fastq2 file.
 
  If they want to submit multiple paired-end samples I only want to allow them
  to submit them as a list of dataset pairs.  I can get this part to work.
 
  I just need to restrict users to selecting only a single fastq1 and fastq2
  file when the input_type is Paired-End.
 
  My XML is as follows.  Any help would be appreciated
 
  <!-- Input FastQ file(s) -->
  <conditional name="input_type">
    <param name="input_type_selector" type="select" label="Select input type" help="Select between single and paired fastq data">
      <option value="single">Single-End</option>
      <option value="paired">Paired-End</option>
      <option value="paired_collection">Paired Collection</option>
    </param>
    <when value="single">
      <param name="fastq_input1" type="data" format="fastq" label="Select fastq dataset" multiple="true" help="Specify dataset with single end reads" />
    </when>
    <when value="paired">
      <param name="fastq_input1" type="data" format="fastq" label="Select fastq dataset" help="Specify dataset with 1st of paired-end reads" />
      <param name="fastq_input2" type="data" format="fastq" label="Select fastq dataset" help="Specify dataset with 2nd of paired-end reads" />
    </when>
    <when value="paired_collection">
      <param name="fastq_collection" type="data_collection" collection_type="list:paired" label="Select a paired collection" help="See help section for an explanation of dataset collections"/>
    </when>
  </conditional>
 
 

Re: [galaxy-dev] problem with view of list collection in history view

2015-06-25 Thread John Chilton
Conversation in IRC. tl;dr - it looks like it might be a GUI related
problem since the API does contain all of the datasets. Carl - any
chance you have an idea of what is going on here?

21:20 <jmchilton> avowinkel: is it possible there were duplicated identifiers (has your discover_datasets pattern changed from earlier)
21:21 <jmchilton> I'm leaning toward saying it is likely a backend problem - since explicit output collections are pretty new and you are the first person I can think of really exercising them strenuously
21:21 <jmchilton> One way to verify though is to check the API - if you just open localhost:port/api/histories in your browser - find the history id
21:21 <jmchilton> then open /api/histories/history_id/contents and then find the collection
21:22 <jmchilton> you should be able to open something like
21:22 <jmchilton> /api/histories/history_id/contents/collections/collection_id - which should show the individual datasets
21:23 <avowinkel> there are definitely no duplicate designations - if that's the same as identifiers
21:23 <avowinkel> It's still <discover_datasets pattern="__name_and_ext__" directory="splits" />
21:27 <jmchilton> My next question would be (if you can verify it is a backend thing) - are the elements in the dataset - the hidden elements less than a certain HID - or are they random.
21:28 <avowinkel> via the api all 96 entries are in the collection
21:29 <avowinkel> with element_index's 0 to 95, in total 96
21:29 <avowinkel> in both lists
21:31 <avowinkel> biggest hid is 202
21:32 <avowinkel> the parent's list hid is always smaller than the containing element's hids
22:26 <jmchilton> so you are sure every element_index from 0 to 95 is represented? This being a GUI problem is really odd - but it seems like it probably is. I wonder if somehow a div id is generated from the identifiers in such a way that one is duplicated. Seems unlikely
22:26 <jmchilton> Can you open your JavaScript console and see if there are any JavaScript errors?
22:27 <avowinkel> well. I did grep element_index, I saw index 0 on the top, Index 95 on the bottom. and wc -l gives 96 - so yes. very sure
22:28 <avowinkel> and when I scan loosely through the list of greps, I don't see anything odd
22:29 <avowinkel> don't want to count from 0 to 95 ^^
22:29 <jmchilton> :)
22:29 <avowinkel> for all the tests
22:30 <jmchilton> does that API response have a hidden field for the datasets?
22:31 <avowinkel> there is nothing in that file that matches hidden
22:31 <avowinkel> (in the history they are all hidden)
22:32 <jmchilton> I would open your web browser and check for javascript errors next
22:34 <avowinkel> nop. nothing (Firefox 34 - ubuntu biolinux)
22:34 <jmchilton> can you send me a screenshot of the expanded collection?
22:35 <avowinkel> the newest run has 69 entries in the history
22:36 <avowinkel> what part do you want screenshotted?
22:38 <jmchilton> "When I open the list, I just can see 64 items." The opened list in the history panel
22:40 <avowinkel> http://snag.gy/2knoI.jpg
22:43 <jmchilton> are you hand counting these lists in the browser then?
22:47 <avowinkel> yes, hand counting
22:50 <jmchilton> I'll ping carl about this - he is the GUI mastermind - he might have some clue
22:59 <avowinkel> jmchilton: http://pastebin.com/DcpF1QAU
22:59 <mrscribe> Title: [YAML] galaxy dataset_collection contents - Pastebin.com (at pastebin.com)
23:00 <avowinkel> don't get confused: On the picture is a different dataset. It doesn't have sample_ in the name
23:04 <jmchilton> yeah - that response looks perfectly fine - really odd

-John
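
The manual check done in the chat (grep for element_index, then count lines) can be sketched as a small function over the decoded API response. This is a sketch under assumptions: the `element_index` field is the one named in the conversation, while the enclosing `elements` key and the helper name `check_element_indexes` are hypothetical.

```python
def check_element_indexes(collection):
    """Report missing and duplicated element_index values.

    `collection` is the decoded JSON of a collection. The "elements" key
    is an assumption; "element_index" is the field named in the chat.
    """
    indexes = [element["element_index"] for element in collection["elements"]]
    expected = set(range(len(indexes)))
    missing = sorted(expected - set(indexes))
    duplicated = sorted({i for i in indexes if indexes.count(i) > 1})
    return {"count": len(indexes), "missing": missing, "duplicated": duplicated}
```

If the count is 96 with nothing missing or duplicated while the history panel shows fewer items, that points toward the display rather than the stored collection.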

On Thu, Jun 25, 2015 at 4:58 PM, Alexander Vowinkel
vowinkel.alexan...@gmail.com wrote:
 Hi Team,

 My tool dynamically creates 96 datasets bundled into a list.
 In the history I can see the number 96 in the top as hidden datasets
 (6 shown, 96 hidden)

 When I open the list, I just can see 64 items.

 Now I run the job again and I have 96 more hidden items.
 I open the new list and can see 66 items in that new list.

 What is going on here?
 Is that just a visual bug?
 Or are my datasets affected?

 Thanks,
 Alexander

 PS: I use postgres
