Re: [galaxy-dev] Dynamic data library
On 10/01/2013 07:12 PM, Cole, Nathan (NIH/NCI) [C] wrote:
> One final question as I dive into looking at these two methods: can you expose whole hierarchies and directories using the from_file method or will this only work on an individual sample basis?

Yes, you can. As an example, have a look at the Affymetrix CEL files we offer. See the attachment for a screenshot and the corresponding fragment from the XML file:

    <options>
      <option name="Affymetrix" value="Affymetrix">
        <option name="HumanGeneST10_TissueData" value="HumanGeneST10_TissueData">
          <option type="meta_key" name="MouseTP_Brain_01_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Brain_01_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Brain_02_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Brain_02_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Brain_03_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Brain_03_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Embryo_01_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Embryo_01_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Embryo_02_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Embryo_02_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Embryo_03_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Embryo_03_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Heart_01_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Heart_01_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Heart_02_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Heart_02_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Heart_03_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Heart_03_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Kidney_01_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Kidney_01_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Kidney_02_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Kidney_02_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Kidney_03_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Kidney_03_mGENE.CEL"/>
          //
          <option type="meta_key" name="MouseTP_Thymus_03_mGENE.CEL" value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Thymus_03_mGENE.CEL"/>
        </option>
        <option name="MouseGeneST10_TissueData" value="MouseGeneST10_TissueData">
          <option type="meta_key" name="MouseTP_Brain_01_mGENE.CEL" value="/***/***/external/Affymetrix/MouseGeneST10_TissueData/MouseTP_Brain_01_mGENE.CEL"/>
          <option type="meta_key" name="MouseTP_Brain_02_mGENE.CEL" value="/***/***/external/Affymetrix/MouseGeneST10_TissueData/MouseTP_Brain_02_mGENE.CEL"/>
          //
          <option type="meta_key" name="MouseTP_Thymus_03_mGENE.CEL" value="/***/***/external/Affymetrix/MouseGeneST10_TissueData/MouseTP_Thymus_03_mGENE.CEL"/>
        </option>
      </option>
      <option name="GEO" value="GEO">
        //
      </option>
    </options>

> If not, is there any method for exposing the whole of a directory on the file system?
>
> Thanks,
> Nathan
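An options file with the nested layout shown above could probably be generated from the directory tree rather than maintained by hand. The following is only a minimal Python sketch under stated assumptions: the function names build_options and write_options are hypothetical, directories are assumed to map to plain nested options and files to type="meta_key" leaves whose value is the full path (as in the fragment above), and nothing here is taken from Galaxy itself.

```python
import os
import xml.etree.ElementTree as ET


def _add_children(parent, directory):
    """Append one <option> per entry of `directory`, recursing into subdirectories."""
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isdir(path):
            # Assumption: a directory becomes a nested option named after itself.
            node = ET.SubElement(parent, "option", name=name, value=name)
            _add_children(node, path)
        else:
            # Assumption: a file becomes a meta_key leaf carrying its full path.
            ET.SubElement(parent, "option", type="meta_key", name=name, value=path)


def build_options(root_dir):
    """Return an <options> element mirroring the tree below `root_dir`."""
    options = ET.Element("options")
    _add_children(options, root_dir)
    return options


def write_options(root_dir, out_path):
    """Write the generated options tree to `out_path` (hypothetical helper)."""
    ET.ElementTree(build_options(root_dir)).write(
        out_path, encoding="utf-8", xml_declaration=True
    )
```

Run from cron (or after each sequencing run) with the samples directory and the options file under tool-data as arguments, this would keep the drill-down list aligned with what is on disk. Whether Galaxy picks up the changed file without a restart is not covered here and would need checking.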
Re: [galaxy-dev] Dynamic data library
One final question as I dive into looking at these two methods: can you expose whole hierarchies and directories using the from_file method, or will this only work on an individual sample basis? If not, is there any method for exposing the whole of a directory on the file system?

Thanks,
Nathan

-----Original Message-----
From: Hans-Rudolf Hotz [mailto:h...@fmi.ch]
Sent: Tuesday, October 01, 2013 10:11 AM
To: Cole, Nathan (NIH/NCI) [C]
Cc: 'galaxy-dev@lists.bx.psu.edu'
Subject: Re: [galaxy-dev] Dynamic data library

On 10/01/2013 03:53 PM, Cole, Nathan (NIH/NCI) [C] wrote:
> Thank you both for your responses. I will be looking into both of these. With regard to the from_file option to add the sample selection into the tool: I assume this means that the metadata and everything is loaded into Galaxy at the time the tool is run.

This depends on how you write your tool. Do you just wanna read the (e.g.) fastq file, or do you also wanna read the metadata? Also, how is the metadata accessible? E.g., is it stored in a txt file at the same location as the fastq file?

> Does this create a copy of the loaded file or simply read it in place? Also, are there any efficiency issues created using this method, outside of the tool run time increase due to the load of the data taking place in-tool?

It should just read it in place.

Hans-Rudolf

> Thanks,
> Nathan
>
> -----Original Message-----
> From: Hans-Rudolf Hotz [mailto:h...@fmi.ch]
> Sent: Tuesday, October 01, 2013 4:07 AM
> To: Cole, Nathan (NIH/NCI) [C]
> Cc: Martin Čech; galaxy-dev@lists.bx.psu.edu
> Subject: Re: [galaxy-dev] Dynamic data library
>
> Hi Nathan
>
> Do you have many tools working with those samples or just a few? If you only have a limited, predefined set of tools, you might wanna consider adding the sample selection into the tool. You can use the from_file or from_data_table options to dynamically create a sample selection list. You can even drill down a hierarchical list.
>
> Have a look at ~/tools/annotation_profiler/annotation_profiler.xml, which uses the file ~/tool-data/annotation_profiler_options.xml
>
> All you need to do is keep the file in sync with the directory structure of your samples directory.
>
> Regards, Hans-Rudolf
>
> On 09/30/2013 09:48 PM, Martin Čech wrote:
>> Hi Nathan,
>>
>> Dannon answered a similar question a few days ago:
>>
>>> There's an import mechanism in libraries that'll allow you to simply link to the file on disk without copy/upload. I believe the example_watch_folder.py sample script (in the distribution) does just this via the API, if you want an example.
>>
>> This might be what you are looking for.
>>
>> Martin
>>
>> On Mon, Sep 30, 2013 at 2:43 PM, Cole, Nathan (NIH/NCI) [C] nathan.c...@nih.gov wrote:
>>> Hello, we’ve set up a local Galaxy instance in our genotyping and next-gen sequencing lab with local Apache LDAP (AD) integration, NFS mounts to a large NAS, and cluster integration coming. Due to the high volume of samples and staff that will be using the system, I want to set up data libraries (without copying to Galaxy). This is obviously no problem the first time; however, I was wondering if there was a way to make a library, added from a system path, be dynamic so that it would stay synchronized with the underlying file structure?
>>>
>>> If a true dynamic library is not possible, is there a method for adding files to an existing library via that same system path that would not duplicate all of the original files in the data library?
>>>
>>> I did some scouring of the list and found some old unanswered questions and some tangentially related topics, but I was unable to find a true answer or solution to my problem. Any information on how to do the tasks above or other solutions to provide the same functionality would be greatly appreciated.
>>>
>>> Thanks,
>>> Nathan

___
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
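The "link to the file on disk without copy/upload" import mentioned in this thread is done through the library contents API. The sketch below only builds such a request and does not send it. Everything in it is an assumption to be checked against example_watch_folder.py in your own distribution: the endpoint path, the payload field names (upload_option, filesystem_paths, link_data_only and so on) are recalled from the legacy library API rather than confirmed by this thread, and build_link_request is a hypothetical helper name.

```python
import json
import urllib.request


def build_link_request(galaxy_url, api_key, library_id, folder_id, path):
    """Build (but do not send) a POST asking Galaxy to link `path` into a library folder.

    All field names below are assumptions; verify them against the
    example_watch_folder.py script shipped with your Galaxy version.
    """
    url = "%s/api/libraries/%s/contents?key=%s" % (
        galaxy_url.rstrip("/"), library_id, api_key)
    payload = {
        "folder_id": folder_id,
        "create_type": "file",
        "upload_option": "upload_paths",     # assumed: import from a server path
        "filesystem_paths": path,            # assumed: path as seen by the Galaxy server
        "link_data_only": "link_to_files",   # assumed: link instead of copying
        "file_type": "auto",
        "dbkey": "",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would submit it. A periodic job could call this for every file on the NAS that is not yet in the library, which approximates a "dynamic" library without duplicating data; files deleted from disk would still need separate handling.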