Hi Luqman,
Take a look at my comments below.

On Fri, Mar 25, 2011 at 5:06 PM, Luqman Hodgkinson <luq...@berkeley.edu> wrote:

> Hi Enis,
>
> Thank you for your detailed reply. After playing with Galaxy, there are
> some questions I have.
>
> 1. All my Java classes are in the same project. There is only a single Main
> class. In order to use Galaxy, must each class be converted to a .jar file
> individually? If so, the disadvantage to this is then that only command-line
> parameters can be passed in. How does the data flow between the classes?
> Must each class read input from files and write output to files? That is,
> what is the nature of your type system for passing data between components?
> Can data be passed directly through RAM or must it go through the file
> system?
>
If you are able to perform your complete analysis using a combination of
your Java classes by simply executing a single command (i.e., invoking a
single main class), the same can be achieved through Galaxy. You would
create a Galaxy tool wrapper that allows Galaxy to invoke that same command,
and the rest (i.e., executing the tool) is going to be the same: the same
set of methods will get invoked and you should get the same output.
Once Galaxy invokes a tool, any form of data flow within that tool is up to
the tool itself, so if the classes share data with each other, the same
will happen once they are invoked through Galaxy.
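
As a rough illustration of the point above, here is a minimal sketch of a
single entry point that a tool wrapper could invoke as one command. Everything
in it is hypothetical and not taken from your project: the class names
(ToolMain, Normalizer, Deduplicator), the jar name in the usage message, and
the two-argument command line (an input path and an output path) are
assumptions made only for the example. The two stages hand data to each other
in memory; only the outer boundary reads and writes files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical single entry point: the only class invoked from the command line.
final class ToolMain {
    // Chains the stages entirely in memory; no intermediate files are involved.
    static List<String> runPipeline(List<String> lines) {
        return Deduplicator.dedupe(Normalizer.normalize(lines));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            // The jar name here is an assumption for illustration only.
            System.err.println("usage: java -jar mytool.jar <input> <output>");
            System.exit(2);
        }
        Path input = Paths.get(args[0]);
        Path output = Paths.get(args[1]);
        // Files are touched only at the boundary of the tool.
        Files.write(output, runPipeline(Files.readAllLines(input)));
    }
}

// Hypothetical stage 1: trims lines, drops empty ones, and lower-cases the rest.
final class Normalizer {
    static List<String> normalize(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed.toLowerCase());
            }
        }
        return out;
    }
}

// Hypothetical stage 2: removes duplicates and returns the items in sorted order.
final class Deduplicator {
    static List<String> dedupe(List<String> items) {
        return new ArrayList<>(new TreeSet<>(items));
    }
}
```

With a layout like this, the wrapper would only need to supply the input and
output paths on the command line; how the classes exchange data internally
stays exactly as it is when you run the command by hand.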


> 2. Provenance is very important for my workflow. The workflow will be run
> multiple times and a large number of versions will be created. These should
> be organized somewhere on the file system with timestamps and descriptions
> of the versions of the workflows that were used. How much support does
> Galaxy have for this?
>
Galaxy keeps track of all the parameters and input data used to run a tool
or a workflow, so once the tool is integrated with Galaxy, keeping up with
the details is trivial. That's a big part of why Galaxy exists to begin
with.


> 3. Have you seen the new Conveyor paper?
> http://www.ncbi.nlm.nih.gov/pubmed?term=21278189 My requirements are very
> similar to those addressed in this paper. However, the current version of
> Conveyor does not seem very stable: I was even unable to get their graphical
> user interface running from their Java files. What are the capabilities of
> Galaxy for this use case?
>
Unfortunately, I have not read this paper yet, so I cannot comment much.
If you have not done so yet, I would suggest you give Galaxy Main
(usegalaxy.org) a shot: run some jobs and create a few workflows to get an
idea of what can be done with tools once they are integrated with Galaxy, as
well as the type of data and information Galaxy keeps. That should give you
a good indication of whether the available functionality can be applied in
your scenario as well.

Enis



> Sincerely, with best wishes,
> Luqman
>
>
> On Mar 25, 2011, at 10:21 AM, Enis Afgan wrote:
>
> Hi Luqman,
> Were you planning on using Galaxy CloudMan (usegalaxy.org/cloud) and
> integrating your tool (i.e., Java classes) into the Galaxy that it deploys
> or simply starting a new EC2 instance and setting up a Galaxy instance from
> scratch?
> Either way, I would suggest trying the process out on your local system
> first. Adding new tools to Galaxy is pretty straightforward once you have
> the tool installed on the system, see
> https://bitbucket.org/galaxy/galaxy-central/wiki/AddToolTutorial. That
> will also allow you to test the overall functionality offered by Galaxy in
> the context of your own tool before trying to deploy the whole thing on the
> cloud.
>
> Once you transition to the cloud though, you would have to repeat the
> process of installing the tool on the created instance as you have done on
> the local system followed by copying the tool wrapper created to integrate
> it with Galaxy. If you started with a clean instance (i.e., not Galaxy
> CloudMan), after you've installed your tool and integrated it with Galaxy,
> you could simply use the AWS web console to create an AMI automatically.
> Then, you would start the newly created AMI, start Galaxy and start
> processing your data. Note that any data you upload to an instance will be
> lost once you terminate the instance though, unless you associate an EBS
> volume with it and have Galaxy store analysis data there (this is easily
> configured in Galaxy's universe_wsgi.ini file).
>
> Alternatively, you could use CloudMan and add your tool to the set of
> already existing tools as described here:
> https://bitbucket.org/galaxy/galaxy-central/wiki/Cloud/CustomizeGalaxyCloud
> If using CloudMan, all of the details regarding data persistence and Galaxy
> setup are automatically managed for you (excluding the addition of your own
> tool).
>
> Hope this helps,
> Enis
>
> On Mon, Mar 21, 2011 at 6:56 PM, Luqman Hodgkinson <luq...@berkeley.edu> wrote:
>
>> Dear Galaxy developers,
>> I have a collection of Java classes linked by a custom dataflow
>> architecture. All classes are in a single project but some of these classes
>> call executables written in languages other than Java. I am investigating
>> the possibility of transitioning to Galaxy. Essentially my desires are to
>> link these Java classes in a DAG representing the dataflow and to execute
>> the dataflow in Amazon EC2. The data flowing along the edges are arbitrary
>> custom Java classes. Additionally it is important to cache intermediate
>> results. The data is acquired from a few web services: iRefIndex, IntAct,
>> UniProt, and Gene Ontology. There are complex software dependencies so after
>> setting up the dataflow I would like to save the entire system as an
>> Amazon Machine Image (AMI). How difficult would this transition be, and
>> would it be worth the effort?
>>                Sincerely, with best wishes,
>>                Luqman Hodgkinson,
>>                Ph.D. student, UC-Berkeley
>>
>>
>>
>>
>>
>> ___________________________________________________________
>> The Galaxy User list should be used for the discussion of
>> Galaxy analysis and other features on the public server
>> at usegalaxy.org.  Please keep all replies on the list by
>> using "reply all" in your mail client.  For discussion of
>> local Galaxy instances and the Galaxy source code, please
>> use the Galaxy Development list:
>>
>>  http://lists.bx.psu.edu/listinfo/galaxy-dev
>>
>> To manage your subscriptions to this and other Galaxy lists,
>> please use the interface at:
>>
>>  http://lists.bx.psu.edu/
>>
>
>
>