[galaxy-dev] Export galaxy interval data to InterMine
Dear Galaxy developers, I'm trying to create a tool to export Galaxy interval data to InterMine instances. First, based on the tool that sends a dataset to EpiGRAPH, I created one for FlyMine, and it works on my local instance. Then I tried something a bit more complex: integrating multiple mines in one tool. The source XML is as follows:

    <?xml version="1.0"?>
    <tool name="Perform advanced genomic region search" id="intermine_export">
      <description>with an InterMine instance</description>
      <redirect_url_params>GENOME=${input1.dbkey} NAME=${input1.name} INFO=${input1.info}</redirect_url_params>
      <inputs>
        <param format="bed" name="input1" type="data" label="Send this dataset to an InterMine instance">
          <validator type="unspecified_build" />
        </param>
        <conditional name="destination">
          <param name="indexDestination" type="select" label="Select InterMine instance" help="InterMine genomic region search service.">
            <option value="flymine">FlyMine</option>
            <option value="modmine">modMine</option>
            <option value="metabolicmine">MetabolicMine</option>
          </param>
          <when value="flymine">
            <param name="REDIRECT_URL" type="hidden" value="http://www.flymine.org/query/genomicRegionSearch.do" />
          </when>
          <when value="modmine">
            <param name="REDIRECT_URL" type="hidden" value="http://intermine.modencode.org/query/genomicRegionSearch.do" />
          </when>
          <when value="metabolicmine">
            <param name="REDIRECT_URL" type="hidden" value="http://www.metabolicmine.org/beta/query/genomicRegionSearch.do" />
          </when>
        </conditional>
        <param name="DATA_URL" type="baseurl" value="/datasets" />
        <param name="GALAXY_URL" type="baseurl" value="/tool_runner?tool_id=intermine_import" />
      </inputs>
      <outputs/>
      <help>some help text here...</help>
    </tool>

I use a conditional tag for the InterMine instances, which is rendered as a dropdown list in the HTML page. Every time I select a different mine, I'd like the value of the REDIRECT_URL param to change accordingly, but this is not working.
After testing and reading the docs a bit, I realised that REDIRECT_URL will be renamed to destination|REDIRECT_URL in the rendered HTML, and the XML parser only recognises the exact (reserved) keyword REDIRECT_URL. Is this the case? Can I still implement my logic in the XML? Thanks, Fengyuan ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
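Because every param inside a <conditional> gets the conditional's name as a prefix in the rendered form (hence destination|REDIRECT_URL), the reserved top-level REDIRECT_URL name is never seen by the redirect machinery. One possible workaround, sketched below and not verified against a live Galaxy instance, is to drop the conditional and make REDIRECT_URL itself the select, with the service URL as each option's value:

```xml
<!-- Sketch only, untested: a single top-level select named REDIRECT_URL,
     so the reserved name is preserved and its value changes with the mine. -->
<param name="REDIRECT_URL" type="select" label="Select InterMine instance"
       help="InterMine genomic region search service.">
  <option value="http://www.flymine.org/query/genomicRegionSearch.do">FlyMine</option>
  <option value="http://intermine.modencode.org/query/genomicRegionSearch.do">modMine</option>
  <option value="http://www.metabolicmine.org/beta/query/genomicRegionSearch.do">MetabolicMine</option>
</param>
```

Whether the redirect code accepts a select (rather than hidden) param under that reserved name is an assumption worth testing on a local instance first.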
[galaxy-dev] barchart not working: GNUplot problem
Hi all, I have got numpy and gnuplot on speaking terms. But now, trying to use the barchart, I get the following error:

    gnuplot> set style histogram clustered gap 5 title offset character 0, 0, 0
                       ^
    line 0: expecting 'data', 'function', 'line', 'fill' or 'arrow'
    line 0: undefined variable: in
    line 0: undefined variable: invert

    gnuplot> set style data histograms
                       ^
    line 0: expecting 'lines', 'points', 'linespoints', 'dots', 'impulses', 'yerrorbars', 'xerrorbars', 'xyerrorbars', 'steps', 'fsteps', 'histeps', 'filledcurves', 'boxes', 'boxerrorbars', 'boxxyerrorbars', 'vectors', 'financebars', 'candlesticks', 'errorlines', 'xerrorlines', 'yerrorlines', 'xyerrorlines', 'pm3d'

Gnuplot works from the command line and has version 4.0. Thanks for any suggestions! Joachim
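The failing commands use histogram syntax (and the `title offset` keyword) that, as far as I know, arrived in gnuplot releases after 4.0, so the 4.0 binary on the PATH, or whichever binary the Python Gnuplot module actually launches, is likely too old. A small sketch (written in modern Python 3 for brevity, not part of Galaxy) to confirm which gnuplot is found and what version it reports:

```python
import re
import shutil
import subprocess

def parse_gnuplot_version(banner):
    """Extract (major, minor) from a 'gnuplot X.Y patchlevel Z' banner."""
    m = re.search(r"gnuplot\s+(\d+)\.(\d+)", banner)
    if m is None:
        raise ValueError("unrecognised gnuplot banner: %r" % banner)
    return int(m.group(1)), int(m.group(2))

if __name__ == "__main__":
    exe = shutil.which("gnuplot")  # the binary a subprocess would find on PATH
    if exe:
        banner = subprocess.run([exe, "--version"],
                                capture_output=True, text=True).stdout
        print(exe, parse_gnuplot_version(banner))
    else:
        print("no gnuplot on PATH")
```

If the version printed here differs from the one you see interactively, the Galaxy process is picking up a different (older) binary via its own PATH.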
Re: [galaxy-dev] cluster path question
Hello everyone - I am also running into this issue trying to get Galaxy integrated with our Sun Grid Engine. My galaxy user's .bash_profile does not appear to get sourced when the jobs run. I augmented the sample sam_filter.py from the tutorial so that it outputs path and user info, letting me see how the jobs were being run:

    out = open(sys.argv[2], "w")
    out2 = open("/data/galaxy-dist/ann.out", "w")
    out2.write(socket.gethostname())
    out2.write("\n")
    out2.write(os.environ['PATH'])
    out2.write("\n")
    drmaa = os.environ.get('DRMAA_LIBRARY_PATH')
    if drmaa is None:
        out2.write("None")
    else:
        out2.write(drmaa)
    out2.write("\n")
    out2.write(str(os.geteuid()))
    out2.write("\n")
    out2.write(str(os.getegid()))
    shutil.copytree("/data/galaxy-dist/database/pbs", "/data/galaxy-dist/ann")

The job is dispatched as my galaxy user; however, the additions to PATH and the other env vars I export in the galaxy user's .bash_profile are not present when the script runs (i.e., .bash_profile is not sourced). When I use qsub to manually run the Galaxy script that gets generated under database/pbs, the output in ann.out reflects my PATH and exported env vars. Was there any other solution to this issue besides the drmaa.py script augment? Thanks much for your help, Ann
Re: [galaxy-dev] cluster path question
I figured out a solution. Sun Grid Engine strips back the environment that gets passed along with the job submission. I added a native DRMAA option, -V, which causes the env vars found in the shell that submits the job to be passed along. Therefore all the environment setup I did in my galaxy user's .bash_profile, and thus configured in the local shell running Galaxy, now gets propagated with my job submissions. This does not allow changes to .bash_profile to be picked up dynamically, however, since .bash_profile is not sourced on each compute node; i.e., changes made to the galaxy user's env need to be re-sourced in the shell that runs Galaxy and dispatches the jobs. Thanks - hope this helps others, Ann

On Sep 12, 2011, at 10:09 AM, Ann Black wrote: Hello everyone - I am also running into this issue trying to get galaxy integrated with our sun grid engine. My galaxy user's .bash_profile does not appear to get sourced when the jobs run. [...]
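For reference, here is roughly how a -V native option can be attached in universe_wsgi.ini; the runner-URL syntax has varied across Galaxy releases, so treat this as a sketch rather than an exact configuration:

```ini
; universe_wsgi.ini -- sketch only; check your Galaxy release's docs for the
; exact runner URL grammar. Native SGE options ride inside the drmaa:// URL.
start_job_runners = drmaa
default_cluster_job_runner = drmaa://-V/
```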
[galaxy-dev] (Composite) Dataset Upload not Setting Metadata
Hi everyone, I've been getting my feet wet with Galaxy development, working to get some of the rexpression tools online, and I've run into a snag that I've traced back to a set_meta datatype method being unable to find a file from which it wants to extract metadata. After reading the code, I believe this would also be a problem for non-composite datatypes. The specific test case I've been looking at is uploading an affybatch file (and associated pheno file) using Galaxy's built-in upload tool and selecting the File Format manually (i.e., choosing affybatch in the dropdown). I am using the unmodified datatype definitions provided in lib/galaxy/datatypes/genetics.py and unmodified core Galaxy upload code as of 5955:949e4f5fa03a. (I am also testing with modified versions, but I am able to reproduce and track this bug in the specified clean version.) The crux of the error is that in JobWrapper.finish(), dataset.set_meta() is called (lib/galaxy/jobs/__init__.py:607) before the composite dataset's uploaded files are moved (in a call to the Tool method self.tool.collect_associated_files(out_data, self.working_directory) on line 670) from the job working directory to the final destination under config.file_path (which defaults to database/files). In my test case, dataset.set_meta( overwrite = False ) eventually calls lib/galaxy/datatypes/genetics.py:Rexp.set_meta(dataset, **kwd). As far as I can tell, the only ways to construct a path to a file in a dataset without using hard-coded paths from external knowledge are the Dataset.get_file_name and Dataset.extra_files_path properties. Unless explicitly told otherwise, both construct a path based on the Dataset.file_path class data member, whose value is set during Galaxy startup to config.file_path (default database/files). However, at the time set_meta is called in this case, the files are not under config.file_path, but rather under the job working directory.
Attempting to open files from the dataset therefore fails when using these paths. However, unless the job working directory is passed to set_meta or during construction of the underlying Dataset object, there doesn't appear to be a way for a Dataset method to access the currently running job (for instance, to get its job ID or working directory). (The second suggestion is actually not possible; since the standard upload is asynchronous, the Dataset object is created (and persisted) before the Job that will process it is created.) Thoughts? This issue also affects Rexp.set_peek, as well as any other functions that may want to read data from the uploaded files before they are moved to their permanent location. This is why, if you have an affybatch file and its associated pheno file and you test this on, say, the public Galaxy server at http://main.g2.bx.psu.edu/, you'll see that the peek info says (for example): ##failed to find /galaxy/main_database/files/002/948/dataset_2948818_files/affybatch_test.pheno It seems that if the current behavior of Dataset.file_path, Dataset.file_name, and Dataset.extra_files_path is part of the desired design of Galaxy, then methods like set_meta should be run after the files have been moved to config.file_path, so they can set metadata based on file contents. It looks like this is intended to happen at least in some cases, judging from lib/galaxy/jobs/__init__.py:568-586; however, in my tests this code does not kick in because hda_tool_output is None. Any clarification on what's happening here, what's supposed to happen when setting metadata on (potentially composite) uploads, why dataset.set_meta() isn't already being called after the files are moved to config.file_path, or any insights on related Galaxy design decisions or constraints I may have missed would be very greatly appreciated. I'd also be glad to provide further detail or test files upon request.
Thank you, Eric Paniagua PS: Further notes on passing the job working directory to set_meta or set_peek - I have had success modifying the code to do this for set_meta, since the call chain starting from dataset.set_meta() in JobWrapper.finish() down to Rexp.set_meta() accepts and forwards keyword argument dictionaries along the way. However, set_peek does not accept arbitrary keyword arguments, making it harder to pass along the job working directory when needed without stepping on the toes of any other code.
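The contrast between the two call chains can be illustrated with a toy sketch (hypothetical class and argument names, not Galaxy's actual code): a **kwd signature lets intermediate callers forward extra context such as the job working directory untouched, while a fixed signature rejects it outright.

```python
# Toy illustration only -- ToyDatatype and job_working_directory are made-up
# names standing in for the Galaxy call chains described above.

class ToyDatatype:
    def set_meta(self, dataset, **kwd):
        # Extra context forwarded via **kwd arrives here even though no
        # intermediate caller declared it explicitly.
        return kwd.get("job_working_directory")

    def set_peek(self, dataset):
        # Fixed signature: any extra keyword argument raises TypeError,
        # which is exactly the difficulty Eric describes for set_peek.
        return None

dt = ToyDatatype()
print(dt.set_meta(object(), job_working_directory="/tmp/job_42"))  # prints /tmp/job_42
```

The clean fix, as the PS suggests, is that only signatures accepting **kwd can grow new context without breaking existing callers.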
Re: [galaxy-dev] cluster path question
Ann Black wrote: I figured out a solution. Sun Grid Engine strips back the environment that gets passed along with the job submission. I added a native DRMAA option, -V, which causes the env vars found in the shell that submits the job to be passed along. [...]

Hi Ann, For SGE, you can also use ~/.sge_request to set up the environment on the execution host. --nate
Re: [galaxy-dev] cluster path question
Thanks Nate! I did not know about sge_request - that helps. I played around with using -V and -v in sge_request, and it works similarly to specifying -V (or -v) in my universe_wsgi.ini file. I am new to Galaxy and am working on getting it instantiated locally. Is there an advantage to setting my native DRMAA options in the galaxy user's local .sge_request file vs. directly in the universe_wsgi.ini configuration file? Would one advantage be that I can put the more universal DRMAA native options that I'd like to be common to all tools there (.sge_request for the galaxy user) vs. duplicating them in each per-tool configuration in the universe_wsgi file? Thanks again, Ann

On Sep 12, 2011, at 12:50 PM, Nate Coraor wrote: Hi Ann, For SGE, you can also use ~/.sge_request to set up the environment on the execution host. --nate [...]
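For anyone following along, a ~/.sge_request file simply lists default qsub options, one per line, applied to every submission by that user; a minimal sketch for the galaxy user might be:

```
# ~/.sge_request for the galaxy user -- same flags qsub accepts on the
# command line, applied to every job this user submits
-V
```

This keeps universal options in one place, while per-tool runner URLs in universe_wsgi.ini remain available for tool-specific overrides.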
[galaxy-dev] ProFTPd on Ubuntu system.
Hi, Has anyone set up ProFTPd successfully on Ubuntu 10.04 to enable FTP upload on Galaxy? I followed the instructions on this link ( http://wiki.g2.bx.psu.edu/Admin/Config/Upload%20via%20FTP) but it doesn't work. Thanks, Luobin