Re: [galaxy-dev] Dynamic data library

2013-10-02 Thread Cole, Nathan (NIH/NCI) [C]
One final question as I dive into looking at these two methods:  can you expose 
whole hierarchies and directories using the "from_file" method, or will this 
only work on an individual sample basis?

If not, is there any method for exposing the whole of a directory on the file 
system?

Thanks,
Nathan


-----Original Message-----
From: Hans-Rudolf Hotz [mailto:h...@fmi.ch] 
Sent: Tuesday, October 01, 2013 10:11 AM
To: Cole, Nathan (NIH/NCI) [C]
Cc: 'galaxy-dev@lists.bx.psu.edu'
Subject: Re: [galaxy-dev] Dynamic data library



On 10/01/2013 03:53 PM, Cole, Nathan (NIH/NCI) [C] wrote:
> Thank you both for your responses.  I will be looking into both of these.
>
> With regard to the from_file option to add the sample selection into the 
> tool:  I assume this means that the metadata and everything is loaded into 
> galaxy at the time the tool is run.

This depends on how you write your tool. Do you just want to read the file (e.g. the 
fastq file), or do you also want to read the metadata? Also, how is the metadata 
accessible? E.g. is it stored in a txt file at the same location as the fastq 
file?

 > Does this create a copy of the loaded file or simply read it in place?  Also 
 > are there any efficiency issues created using this method, outside of the 
 > tool run time increase due to the load of the data taking place in-tool?

It should just read it in place

Hans-Rudolf

>
> Thanks,
> Nathan
>
> -----Original Message-----
> From: Hans-Rudolf Hotz [mailto:h...@fmi.ch]
> Sent: Tuesday, October 01, 2013 4:07 AM
> To: Cole, Nathan (NIH/NCI) [C]
> Cc: Martin Čech; galaxy-dev@lists.bx.psu.edu
> Subject: Re: [galaxy-dev] Dynamic data library
>
> Hi Nathan
>
>
> Do you have many tools working with those samples or just a few? If you only 
> have a limited, predefined set of tools, you might want to consider adding the 
> sample selection to the tool.
>
> You can use the from_file or from_data_table options to dynamically 
> create a sample selection list. You can even drill down a hierarchical 
> list. Have a look at 
> ~/tools/annotation_profiler/annotation_profiler.xml
> which uses the file
> ~/tool-data/annotation_profiler_options.xml
>
> All you need to do is keep the file in sync with the directory 
> structure of your samples directory.
>
>
> Regards, Hans-Rudolf
>
>
>
>
>
>
> On 09/30/2013 09:48 PM, Martin Čech wrote:
>> Hi Nathan,
>>
>> Dannon answered a similar question a few days ago:
>>
>>  There's an import mechanism in libraries that'll allow you to simply
>>  link to the file on disk without copy/upload.  I believe the
>>  "example_watch_folder.py" sample script (in the distribution) does
>>  just this via the API, if you want an example.
>>
>>
>> This might be what you are looking for.
>>
>> Martin
>>
>>
>> On Mon, Sep 30, 2013 at 2:43 PM, Cole, Nathan (NIH/NCI) [C] 
>> <nathan.c...@nih.gov> wrote:
>>
>>  Hello, we’ve set up a local Galaxy instance in our genotyping and
>>  next-gen sequencing lab with local Apache LDAP (AD) integration, NFS
>>  mounts to a large NAS, and cluster integration coming.  Due to the
>>  high volume of samples and staff that will be using the system, I
>>  want to set up data libraries (without copying to Galaxy).  This is
>>  obviously no problem the first time, however I was wondering if
>>  there was a way to make a library, added from a system path, be
>>  dynamic so that it would stay synchronized with the underlying file
>>  structure?
>>
>>  If a true dynamic library is not possible, is there a method for
>>  adding files to an existing library via that same system path that
>>  would not duplicate all of the original files in the data library?
>>
>>  I did some scouring of the list and found some old unanswered
>>  questions and some tangentially related topics, but I was
>>  unable to find a true answer or solution to my problem.  Any
>>  information on how to do the tasks above or other solutions to
>>  provide the same functionality would be greatly appreciated.
>>
>>  Thanks,
>>
>>  Nathan
>>
>>
>>  ___
>>  Please keep all replies on the list by using "reply all"
>>  in your mail client.  To manage your subscriptions to this
>>  and other Galaxy lists, please use the interface at:
>>  

Re: [galaxy-dev] Dynamic data library

2013-10-02 Thread Hans-Rudolf Hotz



On 10/01/2013 07:12 PM, Cole, Nathan (NIH/NCI) [C] wrote:

> One final question as I dive into looking at these two methods:  can you expose whole 
> hierarchies and directories using the "from_file" method or will this only work 
> on an individual sample basis?


Yes, you can.
As an example, have a look at the Affymetrix CEL files we offer. See the 
attachment for a screenshot and the corresponding fragment from the xml 
file:


  
  
value="HumanGeneST10_TissueData">
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Brain_01_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Brain_02_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Brain_03_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Embryo_01_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Embryo_02_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Embryo_03_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Heart_01_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Heart_02_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Heart_03_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Kidney_01_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Kidney_02_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Kidney_03_mGENE.CEL"/>

//
  value="/***/***/external/Affymetrix/HumanGeneST10_TissueData/MouseTP_Thymus_03_mGENE.CEL"/>


value="MouseGeneST10_TissueData">
  value="/***/***/external/Affymetrix/MouseGeneST10_TissueData/MouseTP_Brain_01_mGENE.CEL"/>
  value="/***/***/external/Affymetrix/MouseGeneST10_TissueData/MouseTP_Brain_02_mGENE.CEL"/>

//
  value="/***/***/external/Affymetrix/MouseGeneST10_TissueData/MouseTP_Thymus_03_mGENE.CEL"/>


  
  
//
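
Because a hand-maintained options file like the fragment above can drift from the files on disk, here is a minimal sketch of a consistency check. It assumes the options file uses nested `option` elements with `name` and `value` attributes inside an `options` root (the layout of annotation_profiler_options.xml as far as I recall it; the tag names are not visible in the fragment above), and that leaf values are absolute file paths. The function names are made up for this illustration.

```python
import os
import xml.etree.ElementTree as ET


def listed_paths(options_xml):
    """Collect the value of every leaf <option> (one with no child elements).

    Assumption: leaves carry a file path in their 'value' attribute.
    """
    root = ET.fromstring(options_xml)
    return {
        opt.get("value")
        for opt in root.iter("option")
        if len(opt) == 0
    }


def files_on_disk(root_dir):
    """Return the full path of every regular file below root_dir."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            found.add(os.path.join(dirpath, name))
    return found


def sync_report(options_xml, root_dir):
    """Compare the options file against the directory tree."""
    listed = listed_paths(options_xml)
    present = files_on_disk(root_dir)
    return {
        # listed in the XML but no longer on disk
        "missing_on_disk": sorted(listed - present),
        # on disk but not yet offered in the tool
        "not_listed": sorted(present - listed),
    }
```

Running `sync_report(open(options_file).read(), samples_dir)` from a cron job would show when the options file needs regenerating; both arguments are placeholders for your own paths.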
  







Re: [galaxy-dev] Dynamic data library

2013-10-01 Thread Hans-Rudolf Hotz



On 10/01/2013 03:53 PM, Cole, Nathan (NIH/NCI) [C] wrote:

> Thank you both for your responses.  I will be looking into both of these.
>
> With regard to the from_file option to add the sample selection into the tool:  
> I assume this means that the metadata and everything is loaded into galaxy at 
> the time the tool is run.


This depends on how you write your tool. Do you just want to read the file 
(e.g. the fastq file), or do you also want to read the metadata? Also, how 
is the metadata accessible? E.g. is it stored in a txt file at the same 
location as the fastq file?


> Does this create a copy of the loaded file or simply read it in 
> place?  Also are there any efficiency issues created using this method, 
> outside of the tool run time increase due to the load of the data taking 
> place in-tool?


It should just read it in place

Hans-Rudolf



___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
 http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
 http://galaxyproject.org/search/mailinglists/

Re: [galaxy-dev] Dynamic data library

2013-10-01 Thread Hans-Rudolf Hotz

Hi Nathan


Do you have many tools working with those samples or just a few? If you 
only have a limited, predefined set of tools, you might want to consider 
adding the sample selection to the tool.


You can use the from_file or from_data_table options to dynamically 
create a sample selection list. You can even drill down a hierarchical 
list. Have a look at

~/tools/annotation_profiler/annotation_profiler.xml
which uses the file
~/tool-data/annotation_profiler_options.xml

All you need to do is keep the file in sync with the directory 
structure of your samples directory.



Regards, Hans-Rudolf
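
One way to keep such a file in sync is to regenerate it from the samples directory instead of editing it by hand. The sketch below is only an illustration, not part of Galaxy: it assumes the options file consists of nested `option` elements with `name` and `value` attributes under an `options` root (check annotation_profiler_options.xml in your own installation for the exact layout), and `build_options` and `write_options_file` are hypothetical helper names.

```python
import os
import xml.etree.ElementTree as ET


def _add_children(parent, directory):
    """Add one <option> per directory entry, recursing into subdirectories."""
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if os.path.isdir(path):
            # A directory becomes a group node that can be drilled into.
            node = ET.SubElement(parent, "option", name=entry, value=entry)
            _add_children(node, path)
        else:
            # A file becomes a leaf whose value is its full path (assumption).
            ET.SubElement(parent, "option", name=entry, value=path)


def build_options(root_dir):
    """Return an <options> element mirroring the tree under root_dir."""
    options = ET.Element("options")
    _add_children(options, root_dir)
    return options


def write_options_file(root_dir, out_path):
    """Write the generated hierarchy to out_path as XML."""
    tree = ET.ElementTree(build_options(root_dir))
    tree.write(out_path, encoding="utf-8", xml_declaration=True)
```

Calling `write_options_file` on a schedule (for example from cron) with the samples directory and the path of the options file in tool-data would rebuild the file after new samples arrive. Whether a running Galaxy picks up the new file without a restart is something to verify on your instance.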







Re: [galaxy-dev] Dynamic data library

2013-09-30 Thread Martin Čech
Hi Nathan,

Dannon answered a similar question a few days ago:

> There's an import mechanism in libraries that'll allow you to simply link
> to the file on disk without copy/upload.  I believe the
> "example_watch_folder.py" sample script (in the distribution) does just
> this via the API, if you want an example.


This might be what you are looking for.

Martin
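
For readers who want to see the shape of such a call, here is a rough sketch of linking server-side paths into a library without copying them. None of it is taken from the thread: the endpoint (`libraries/<id>/contents`) and the field names (`upload_option`, `filesystem_paths`, `link_data_only`) are my recollection of the library contents API and should be checked against your Galaxy version, and `link_payload`, `new_paths` and `link_into_library` are hypothetical helper names. Uploading by path typically also requires an admin API key and path pasting to be enabled in the Galaxy configuration.

```python
import json
import urllib.request


def link_payload(folder_id, paths):
    """Build the request body for linking files in place (assumed field names)."""
    return {
        "folder_id": folder_id,
        "create_type": "file",
        "upload_option": "upload_paths",
        "filesystem_paths": "\n".join(paths),
        "link_data_only": "link_to_files",  # link instead of copying
        "file_type": "auto",
        "dbkey": "",
    }


def new_paths(paths_on_disk, already_linked):
    """Return only the paths that have not been linked yet, so reruns do not duplicate."""
    return sorted(set(paths_on_disk) - set(already_linked))


def link_into_library(api_url, api_key, library_id, folder_id, paths):
    """POST the payload to the (assumed) library contents endpoint."""
    url = "%s/libraries/%s/contents?key=%s" % (
        api_url.rstrip("/"), library_id, api_key)
    body = json.dumps(link_payload(folder_id, paths)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A periodic job could list the sample directory, pass the result through `new_paths` together with a record of what was linked before, and call `link_into_library` only for the remainder. That approximates a library that follows the file system without re-adding existing files.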



[galaxy-dev] Dynamic data library

2013-09-30 Thread Cole, Nathan (NIH/NCI) [C]
Hello, we've set up a local Galaxy instance in our genotyping and next-gen 
sequencing lab with local Apache LDAP (AD) integration, NFS mounts to a large 
NAS, and cluster integration coming.  Due to the high volume of samples and 
staff that will be using the system, I want to set up data libraries (without 
copying to Galaxy).  This is obviously no problem the first time, however I was 
wondering if there was a way to make a library, added from a system path, be 
dynamic so that it would stay synchronized with the underlying file structure?

If a true dynamic library is not possible, is there a method for adding files to 
an existing library via that same system path that would not duplicate all of 
the original files in the data library?

I did some scouring of the list and found some old unanswered questions and 
some tangentially related topics, but I was unable to find a true answer 
or solution to my problem.  Any information on how to do the tasks above or 
other solutions to provide the same functionality would be greatly appreciated.

Thanks,
Nathan
