Re: [galaxy-dev] Dynamic data library

2013-10-02 Thread Hans-Rudolf Hotz



On 10/01/2013 07:12 PM, Cole, Nathan (NIH/NCI) [C] wrote:

One final question as I dive into looking at these two methods:  can you expose whole 
hierarchies and directories using the from_file method or will this only work 
on an individual sample basis?


Yes, you can.
As an example, have a look at the Affymetrix CEL files we offer. See the 
attachment for a screenshot and the corresponding fragment from the XML 
file:


<options>
  <option name="Affymetrix" value="Affymetrix">
    <option name="HumanGeneST10_TissueData" value="HumanGeneST10_TissueData">
      <option type="meta_key" name="MouseTP_Brain_01_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Brain_01_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Brain_02_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Brain_02_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Brain_03_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Brain_03_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Embryo_01_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Embryo_01_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Embryo_02_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Embryo_02_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Embryo_03_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Embryo_03_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Heart_01_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Heart_01_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Heart_02_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Heart_02_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Heart_03_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Heart_03_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Kidney_01_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Kidney_01_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Kidney_02_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Kidney_02_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Kidney_03_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Kidney_03_mGENE.CEL"/>
      <!-- ... -->
      <option type="meta_key" name="MouseTP_Thymus_03_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Thymus_03_mGENE.CEL"/>
    </option>
    <option name="MouseGeneST10_TissueData" value="MouseGeneST10_TissueData">
      <option type="meta_key" name="MouseTP_Brain_01_mGENE.CEL" value="/***/***/external/Affymetrix/MouseGeneST10_TissueData/MouseTP_Brain_01_mGENE.CEL"/>
      <option type="meta_key" name="MouseTP_Brain_02_mGENE.CEL" value="/***/***/external/Affymetrix/MouseGeneST10_TissueData/MouseTP_Brain_02_mGENE.CEL"/>
      <!-- ... -->
      <option type="meta_key" name="MouseTP_Thymus_03_mGENE.CEL" value="/***/***/external/Affymetrix/MouseGeneST10_TissueData/MouseTP_Thymus_03_mGENE.CEL"/>
    </option>
  </option>
  <option name="GEO" value="GEO">
    <!-- ... -->
  </option>
</options>
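
Since the options file has to follow the directory structure, it could be regenerated by a small script (e.g. from cron) rather than edited by hand. The following is only a rough sketch, not what we run here: the function names are made up for illustration, it assumes that every directory becomes a nested option and every file a type="meta_key" leaf carrying its full path (as in the fragment above), and it is written for Python 3.

```python
import os
import xml.etree.ElementTree as ET


def build_options(root_dir):
    """Return an <options> element mirroring the tree under root_dir."""
    options = ET.Element("options")
    _add_children(options, root_dir)
    return options


def _add_children(parent, directory):
    # Sorted so that regenerating the file gives a stable result.
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isdir(path):
            # Directories become nested groups the user can drill into.
            node = ET.SubElement(parent, "option", name=name, value=name)
            _add_children(node, path)
        else:
            # Files become leaf entries whose value is the full path.
            ET.SubElement(parent, "option", type="meta_key",
                          name=name, value=path)


def write_options(root_dir, out_file):
    """Write the generated options tree to out_file (hypothetical path)."""
    tree = ET.ElementTree(build_options(root_dir))
    tree.write(out_file, encoding="utf-8", xml_declaration=True)
```

Calling write_options() with the top-level samples directory and the location of the options file in tool-data would rewrite the whole file on each run. Whether Galaxy picks up the change without a restart is something to check on your instance.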





If not, is there any method for exposing the whole of a directory on the file 
system?

Thanks,
Nathan



Re: [galaxy-dev] Dynamic data library

2013-10-02 Thread Cole, Nathan (NIH/NCI) [C]
One final question as I dive into looking at these two methods:  can you expose 
whole hierarchies and directories using the from_file method or will this 
only work on an individual sample basis?

If not, is there any method for exposing the whole of a directory on the file 
system?

Thanks,
Nathan



Re: [galaxy-dev] Dynamic data library

2013-10-01 Thread Hans-Rudolf Hotz



On 10/01/2013 03:53 PM, Cole, Nathan (NIH/NCI) [C] wrote:

Thank you both for your responses.  I will be looking into both of these.

With regard to the from_file option to add the sample selection into the tool:  
I assume this means that the metadata and everything is loaded into galaxy at 
the time the tool is run.


This depends on how you write your tool. Do you just wanna read e.g. the 
fastq file, or do you also wanna read the metadata? Also, how is the 
metadata accessible? E.g. is it stored in a txt file at the same 
location as the fastq file?


 Does this create a copy of the loaded file or simply read it in 
place?  Also are there any efficiency issues created using this method, 
outside of the tool run time increase due to the load of the data taking 
place in-tool?


It should just read it in place.

Hans-Rudolf



Thanks,
Nathan

-----Original Message-----
From: Hans-Rudolf Hotz [mailto:h...@fmi.ch]
Sent: Tuesday, October 01, 2013 4:07 AM
To: Cole, Nathan (NIH/NCI) [C]
Cc: Martin Čech; galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Dynamic data library

Hi Nathan


Do you have many tools working with those samples or just a few? If you only 
have a limited, predefined set of tools you might wanna consider adding the 
sample selection into the tool.

You can use the from_file or from_data_table options to dynamically create a 
sample selection list. You can even drill down a hierarchical list. Have a look 
at ~/tools/annotation_profiler/annotation_profiler.xml
which uses the file
~/tool-data/annotation_profiler_options.xml

All you need to do is keep the file in sync with the directory structure of 
your samples directory.


Regards, Hans-Rudolf






On 09/30/2013 09:48 PM, Martin Čech wrote:

Hi Nathan,

Dannon answered a similar question a few days ago:

 There's an import mechanism in libraries that'll allow you to simply
 link to the file on disk without copy/upload.  I believe the
 example_watch_folder.py sample script (in the distribution) does
 just this via the API, if you want an example.


This might be what you are looking for.
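
To give an idea of what such an API call could look like, here is a rough sketch in Python 3 (the sample script in the distribution is the reference, not this). The helper names are made up, and the parameter names (upload_option, filesystem_paths, link_data_only) are given as I recall them from the library contents API, so please check them against your Galaxy release. It also assumes an admin API key and that importing from filesystem paths is enabled in the Galaxy config.

```python
import json
import urllib.request


def link_payload(folder_id, paths):
    """Build the request body for linking server-side files into a folder."""
    return {
        "folder_id": folder_id,
        "create_type": "file",
        "upload_option": "upload_paths",
        # One path per line, as seen by the Galaxy server.
        "filesystem_paths": "\n".join(paths),
        # Assumed value: link to the files on disk instead of copying them.
        "link_data_only": "link_to_files",
        "file_type": "auto",
        "dbkey": "?",
    }


def link_files(galaxy_url, api_key, library_id, folder_id, paths):
    """POST the payload to the library contents endpoint."""
    url = "%s/api/libraries/%s/contents?key=%s" % (
        galaxy_url.rstrip("/"), library_id, api_key)
    body = json.dumps(link_payload(folder_id, paths)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Running link_files() only for paths that are not yet in the library (for example from a cron job that walks the NAS mount) would add new files without duplicating the existing ones.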

Martin


On Mon, Sep 30, 2013 at 2:43 PM, Cole, Nathan (NIH/NCI) [C]
<nathan.c...@nih.gov> wrote:

 Hello, we’ve set up a local Galaxy instance in our genotyping and
 next-gen sequencing lab with local Apache LDAP (AD) integration, NFS
 mounts to a large NAS, and cluster integration coming.  Due to the
 high volume of samples and staff that will be using the system, I
 want to set up data libraries (without copying to Galaxy).  This is
 obviously no problem the first time; however, I was wondering if
 there was a way to make a library, added from a system path,
 dynamic so that it would stay synchronized with the underlying file
 structure?

 If a true dynamic library is not possible, is there a method for
 adding files to an existing library via that same system path that
 would not duplicate all of the original files in the data library?

 I did some scouring of the list and found some old unanswered
 questions and some tangentially related topics, but I was
 unable to find a true answer or solution to my problem.  Any
 information on how to do the tasks above or other solutions to
 provide the same functionality would be greatly appreciated.

 Thanks,

 Nathan


 ___
 Please keep all replies on the list by using reply all
 in your mail client.  To manage your subscriptions to this
 and other Galaxy lists, please use the interface at:
 http://lists.bx.psu.edu/

 To search Galaxy mailing lists use the unified search at:
 http://galaxyproject.org/search/mailinglists/



