Re: [galaxy-dev] HOW TO RETRIEVE DATA FROM HISTORY??!!

2011-09-02 Thread colin molter
On Thu, Aug 4, 2011 at 9:57 PM, colin molter colin.mol...@gmail.com wrote:
 Is there a way to directly move/copy data from your galaxy history to a
 given location in the filesystem of the same galaxy server?

2011/9/1 Edward Kirton eskir...@lbl.gov

 why not create a simple export tool?  perhaps with the option to cp
 or symlink.


This is exactly what I would like to have. I checked the Tool Shed, but it
seems that such a tool doesn't exist yet. Before trying to write one, I
wanted to be sure that nobody already had a similar tool to share.
thx
colin
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

[galaxy-dev] problem in workflow

2011-09-02 Thread shashi shekhar
Hi All,

 I am using a local instance of Galaxy and I am not able to use workflows. When I
click on Workflow, it only displays the loading image and I cannot see the
workflow editor. Could this depend on the browser?


Regards
shashi

Re: [galaxy-dev] HOW TO RETRIEVE DATA FROM HISTORY??!!

2011-09-02 Thread Steve Taylor

Hi,

We have written a tool that we call gls (galaxy ls). Running it is similar to doing 
ls -ltr in that it lists the histories and their datasets oldest first (as in the 
example below) and shows the actual path of each .dat file on the file system. You 
can then symlink/copy the actual files.

Example output:

2011-03-08 14:29:18 - Test 1
2011-03-08 14:29:56 - CXXC.bed /galaxy/database/files/001/dataset_1749.dat
2011-03-16 14:08:00 - BED-to-GFF on data 1 /galaxy/database/files/001/dataset_1750.dat

2011-06-17 12:13:28 - Test 2
2011-06-17 12:14:24 - UCSC Main on Chicken: refGene (genome) /galaxy/database/files/003/dataset_3085.dat
2011-06-17 12:51:02 - UCSC Main on Chicken: refGene (chr2:57311158-57314247) /galaxy/database/files/003/dataset_3086.dat

2011-07-27 07:12:30 - Test 3
2011-07-27 07:15:44 - http://www.molbiol.ox.ac.uk/data/biopivot/example/small/example.gff3 /galaxy/database/files/003/dataset_3296.dat
2011-07-27 07:16:27 - annotated gff3 on data 1 /galaxy/database/files/003/dataset_3297.dat
2011-07-27 07:18:42 - UCSC Main on Human: eponine (genome) /galaxy/database/files/003/dataset_3298.dat
2011-07-27 07:19:38 - annotated overlap gff3 on data 3 and data 2 /galaxy/database/files/003/dataset_3299.dat
2011-08-10 07:35:20 - SLX-3645.591.s_4_Input_AB_peaks.txt /galaxy/database/files/004/dataset_4086.dat
2011-08-10 07:55:48 - macs2gff3 on data 5 /galaxy/database/files/004/dataset_4088.dat
2011-08-10 07:56:37 - annotated gff3 on data 7 /galaxy/database/files/004/dataset_4089.dat


It's written in Perl, is run from the command line and accesses the galaxy 
database. We are happy to make this available if there is interest. Is the Tool 
Shed the best place to put it given it isn't a wrapper?

Regards,

Steve
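
Once a tool such as gls has printed the path of a .dat file, the symlink/copy step can be scripted. The following is a minimal Python sketch under that assumption; the function name export_dataset and its link option are hypothetical and are not part of gls or Galaxy.

```python
import os
import shutil


def export_dataset(dat_path, dest_path, link=False):
    """Copy (or symlink) a Galaxy .dat file to a chosen location.

    Hypothetical helper: dat_path is a path as printed by a tool such as gls.
    """
    if not os.path.isfile(dat_path):
        raise IOError("dataset file not found: %s" % dat_path)
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    if link:
        # A symlink saves space but dangles if the dataset file is later removed.
        os.symlink(os.path.abspath(dat_path), dest_path)
    else:
        # copy2 keeps the modification time along with the content.
        shutil.copy2(dat_path, dest_path)
    return dest_path
```

For example, export_dataset("/galaxy/database/files/001/dataset_1749.dat", "/data/export/CXXC.bed") would copy the first dataset listed above; the /data/export directory is made up for the example. Copying is the safer default, since a symlink points into Galaxy's own file store.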





[galaxy-dev] Problem with workflow

2011-09-02 Thread shashi shekhar
Hi,

I am not able to create a workflow in my local instance of Galaxy. The
browser only displays the loading image. I am using an old version of
Galaxy.

Regards
shashi shekhar

[galaxy-dev] link file bug in the new version

2011-09-02 Thread remy d1
Hi,

We found a small problem in the new Galaxy release when uploading a
library dataset from the filesystem:

Admin > Manage data libraries > create new data library > Add dataset >
Upload files from filesystem path > Link to files without copying

If the galaxy user owns the file or has write permission on it
(on the filesystem), the file is deleted!


I do not think this is the intended behaviour...


Regards.

Re: [galaxy-dev] HOW TO RETRIEVE DATA FROM HISTORY??!!

2011-09-02 Thread Greg Von Kuster
Hello Steve,

Thanks for making this available.  If you could email me the script, I'll 
include it in the ~/contrib directory in the Galaxy distribution.

Greg Von Kuster


Greg Von Kuster
Galaxy Development Team
g...@bx.psu.edu






[galaxy-dev] Installation error

2011-09-02 Thread IT Support
Greetings,

I attempted to install Galaxy on a ROCKS 5.3 cluster. I checked out Galaxy to a 
location commonly accessible by all analysis nodes. I then set the path to 
Python 2.7 like so:

export PATH=/opt/galaxy-python/python:$PATH

However, executing run.sh gives me the error at the end of this message. The 
strange thing is that Galaxy runs fine on a CentOS 5.6 virtual machine.

Any help would be much appreciated. Thanks! 

-

Traceback (most recent call last):
  File "./scripts/paster.py", line 34, in ?
    command.run()
  File "/opt/galaxy-dist/eggs/PasteScript-1.7.3-py2.4.egg/paste/script/command.py", line 84, in run
    invoke(command, command_name, options, args[1:])
  File "/opt/galaxy-dist/eggs/PasteScript-1.7.3-py2.4.egg/paste/script/command.py", line 123, in invoke
    exit_code = runner.run(args)
  File "/opt/galaxy-dist/eggs/PasteScript-1.7.3-py2.4.egg/paste/script/command.py", line 218, in run
    result = self.command()
  File "/opt/galaxy-dist/eggs/PasteScript-1.7.3-py2.4.egg/paste/script/serve.py", line 276, in command
    relative_to=base, global_conf=vars)
  File "/opt/galaxy-dist/eggs/PasteScript-1.7.3-py2.4.egg/paste/script/serve.py", line 311, in loadapp
    return loadapp(
  File "/opt/galaxy-dist/eggs/PasteDeploy-1.3.3-py2.4.egg/paste/deploy/loadwsgi.py", line 204, in loadapp
    return loadobj(APP, uri, name=name, **kw)
  File "/opt/galaxy-dist/eggs/PasteDeploy-1.3.3-py2.4.egg/paste/deploy/loadwsgi.py", line 225, in loadobj
    return context.create()
  File "/opt/galaxy-dist/eggs/PasteDeploy-1.3.3-py2.4.egg/paste/deploy/loadwsgi.py", line 625, in create
    return self.object_type.invoke(self)
  File "/opt/galaxy-dist/eggs/PasteDeploy-1.3.3-py2.4.egg/paste/deploy/loadwsgi.py", line 110, in invoke
    return fix_call(context.object, context.global_conf, **context.local_conf)
  File "/opt/galaxy-dist/eggs/PasteDeploy-1.3.3-py2.4.egg/paste/deploy/util/fixtypeerror.py", line 57, in fix_call
    val = callable(*args, **kw)
  File "/opt/galaxy-dist/lib/galaxy/web/buildapp.py", line 90, in app_factory
    add_controllers( webapp, app )
  File "/opt/galaxy-dist/lib/galaxy/web/buildapp.py", line 39, in add_controllers
    module = __import__( module_name )
  File "/opt/galaxy-dist/lib/galaxy/web/controllers/admin.py", line 310
    link=( lambda item: (dict( operation="Manage users and groups", id=item.id, webapp="galaxy" ) if not item.default else dict( operation="Change amount", id=item.id, webapp="galaxy" )) ),

-
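
Two details in the traceback suggest that run.sh is not picking up the Python 2.7 from the PATH setting above: the frame marker "in ?" and the py2.4 egg names are what Python 2.4 produces, and the line reported in admin.py uses a conditional expression (x if cond else y), which Python only accepts from version 2.5 onwards. The sketch below is a small check that can be run with whichever python the startup script finds; the function name interpreter_report is made up for illustration.

```python
import sys


def interpreter_report(minimum=(2, 5)):
    """Describe the running interpreter and whether it meets a minimum version."""
    version = tuple(sys.version_info[:3])
    return {
        "executable": sys.executable,
        "version": ".".join(str(part) for part in version),
        "ok": version >= tuple(minimum),
    }


report = interpreter_report()
print("%(executable)s -> Python %(version)s (new enough: %(ok)s)" % report)
```

If the executable printed here is not the one under /opt/galaxy-python, the PATH change is probably not reaching the shell that starts Galaxy.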





Re: [galaxy-dev] contribute tools to galaxy?

2011-09-02 Thread Dongjun Chung

Thanks a lot! It's really helpful!
I'm looking forward to seeing the new version of Galaxy / the Galaxy tool shed soon!

Dongjun

On 8/31/2011 2:38 PM, Greg Von Kuster wrote:

Hello Dongjun, see my answers to your question inline.

On Aug 28, 2011, at 10:12 AM, Dongjun Chung wrote:


Hi All,

I'm a newbie to galaxy and enjoying it a lot these days. Thanks for 
the great work.


I have a question regarding contribution of software to galaxy. We 
developed a ChIP-seq peak calling algorithm and software (R package) 
and hope to contribute it to galaxy. I have read the wiki and prior 
mailing list about the contribution system but it is still somewhat 
confusing to me.


1. It seems that I can contribute our software to the tool shed if I 
prepare appropriate code and definition files. Then, users can download 
and use it with their locally installed Galaxy. However, these files 
committed to the tool shed will not appear on the Galaxy main or test 
servers. Am I correct?


Tools from the Galaxy tool shed may or may not be available in the 
Galaxy test / main servers hosted at Penn State.




2. What exactly is the relationship between the main/test servers and the tool shed?


The Galaxy tool shed enables the Galaxy community to share tools. 
 These tools are generally used in local Galaxy installations, but not 
available on the Penn State instances.


Can we contribute our software to main or test servers as well? 


The tools available on the Penn State servers are generally developed 
by the Galaxy core development team, and are available in the Galaxy 
distribution.  If your tools complement the tools currently available 
in the distribution, the Galaxy core development team may agree to 
include them.  However, see my comments 2 answers below regarding 
upcoming enhancements to Galaxy and the tool shed.


Or only galaxy core developers can add new tools to galaxy main or 
test servers? 


Yes.

If so, which software is considered to be added to servers? Are they 
chosen from tools contributed to tool shed?


Currently, tools available in the distribution and on the Penn State 
instances are generally developed by the core Galaxy development team. 
 However, upcoming enhancements to Galaxy and the Galaxy tool shed 
will enable automatic installation of tools from the tool shed into 
local Galaxy instances, eliminating the necessity to include tools in 
the distribution.  This automation will be available fairly soon.




3. If our software is an R package, then users need to download and 
install it in their R system before they can use our software 
within their Galaxy system, even if they have the appropriate 
definition files. Am I right? Or is there a better solution for this?


If your tools require R, then those that use your tools will need to 
install R in the Galaxy environment path so the tools will be 
functional.  Tools that have dependencies like this should include a 
requirements tag set in the tool config.


See 
http://wiki.g2.bx.psu.edu/Admin/Tools/Tool%20Config%20Syntax#A.3Crequirements.3E_tag_set




Thanks!

Best,
Dongjun


Greg Von Kuster
Galaxy Development Team
g...@bx.psu.edu






Re: [galaxy-dev] Suggestion / Request for Comments on Galaxy Best Practices - Gradual migration to standard indentation

2011-09-02 Thread Nate Coraor
Peter Cock wrote:
 On Thu, Sep 1, 2011 at 9:00 PM, Trevor Wennblom tre...@well.com wrote:
  ...
  given that python has syntactically significant whitespace, i also
  try to maintain the convention of indentation with four-spaces.
  i've noticed this isn't consistent within the codebase, but does
   seem to be the preferred style such as in `lib/galaxy/datatypes/`.
 
  python comes packaged with the script `reindent.py`:
 
  ...
 
  this is recommended practice per PEP 8:
   http://www.python.org/dev/peps/pep-0008/
 
  ...
 
  would anyone be opposed to me fixing up the current codebase
  to adhere to this? running `reindent.py` on the files is easy enough,
  i'm willing to step through the files (`opendiff` / `FileMerge.app`)
  and verify no unlikely syntactic changes have occurred. i can also
  deliver changes in gradual chunked pull requests to ease
  current developers getting possibly bit by merge issues.
 
 +1 on correcting any tabs to spaces in the Galaxy Python code.
 Doing this in chunked commits makes good sense too - although
 if you can get one of the Galaxy team to do this directly it might
 be quicker.

It is the Galaxy Team's intent to use four-space indents, anything else
is a mistake and we try to fix 'em as we see 'em.  Nobody has yet taken
the time to fix them all (i.e. with reindent.py) but such fixes would be
welcome.
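
For illustration only, the core of such a cleanup is replacing tabs in the leading whitespace with four spaces. The sketch below is a simplified stand-in and not reindent.py itself, which does more than this; the function name expand_leading_tabs is hypothetical.

```python
def expand_leading_tabs(line, width=4):
    """Replace tabs in the leading whitespace of one line with spaces.

    Tabs that appear after the first non-whitespace character are kept.
    """
    stripped = line.lstrip(" \t")
    indent = line[: len(line) - len(stripped)]
    return indent.expandtabs(width) + stripped
```

Python 2 itself treats a tab as advancing to the next multiple of eight columns, so a blind expansion like this can change the meaning of files that mix tabs and spaces. Those files are the ones that need the manual review Trevor describes.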

 Personally I'd like to go further and fix the non-PEP8 white
 space in most of the Galaxy Python code, e.g.
 
 function ( argument )
 
 rather than:
 
 function(argument).

There are a lot of parts of PEP 8 that I doubt we want to adhere to
strictly.  I agree it's annoying to find varying styles, but the space
inside parentheses is not one I get too worked up about.

FWIW, I prefer function( argument ) and I tend to see that style out of
most of the rest of the team.

 Chatting to some of the Galaxy team at BOSC/ISMB 2011
 there is some support for this internally. Again, there are
 automated tools to do this.
 
  would anyone be willing to add the appropriate hooks to
  the central repository as well?
 
 As long as there are no false positives identified during
 the initial tab/space conversion that seems sensible to
 prevent new tabs creeping in. But not essential.

We can't add custom hooks to the bitbucket repository and we don't have
an intermediate local repository that we all push changesets through.

--nate

 
 Peter
 


Re: [galaxy-dev] Suggestion / Request for Comments on Galaxy Best Practices - Gradual migration to standard indentation

2011-09-02 Thread Fields, Christopher J
On Sep 2, 2011, at 9:57 AM, Nate Coraor wrote:

 ...
 Chatting to some of the Galaxy team at BOSC/ISMB 2011
 there is some support for this internally. Again, there are
 automated tools to do this.
 
 would anyone be willing to add the appropriate hooks to
 the central repository as well?
 
 As long as there are no false positives identified during
 the initial tab/space conversion that seems sensible to
 prevent new tabs creeping in. But not essential.
 
 We can't add custom hooks to the bitbucket repository and we don't have
 an intermediate local repository that we all push changesets through.
 
 --nate

The same applies for most public repos like bitbucket; github is the same (only 
post-receive is allowed there).  I'm unsure whether repository hooks are stored 
on github, but they are definitely ignored (only specific post-receive hooks 
are allowed, and these are set up via the github admin API).

With git the only way I can think of to set something like this up is 
client-side (pre/post-commit, or pre-push), then maybe have a separate 
post-checkout hook to set everything up after a clone.  I assume hg has a 
similar mechanism.
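
As a sketch of the client-side approach, a pre-commit check only has to scan the files about to be committed and report failure when it finds a tab in the indentation. How the script is wired into git or hg is left out here, and the names find_tab_indents and main are assumptions for illustration.

```python
def find_tab_indents(path):
    """Return the 1-based line numbers in path whose indentation contains a tab."""
    offending = []
    with open(path) as handle:
        for number, line in enumerate(handle, 1):
            indent = line[: len(line) - len(line.lstrip(" \t"))]
            if "\t" in indent:
                offending.append(number)
    return offending


def main(paths):
    """Print offending locations; return 1 if any were found, else 0."""
    status = 0
    for path in paths:
        for number in find_tab_indents(path):
            print("%s:%d: tab in indentation" % (path, number))
            status = 1
    return status
```

A hook script would pass the changed file names to main and exit with its return value, so that a non-zero status aborts the commit.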

chris




[galaxy-dev] Galaxy egg fetching error? Mac OS X 10.7 (Lion)

2011-09-02 Thread Whyte, Jeffrey
Hi,

I've been having trouble running a local instance of Galaxy on a Mac Pro after 
upgrading to OS X 10.7 (Lion).  My Python version is 2.7.1 and Mercurial is 
1.9.1 for MacOS X 10.7.  I don't have any MacPorts installed.

The error I see after running the startup script is pasted at the end of this 
message.  Thanks in advance for any help or advice.
jjw

-

[~/galaxy-dist] myuserid 10:11 AM  ./run.sh
Some eggs are out of date, attempting to fetch...
Warning: MarkupSafe (a dependent egg of Mako) cannot be fetched
Warning: decorator (a dependent egg of sqlalchemy-migrate) cannot be fetched
Warning: simplejson (a dependent egg of WebHelpers) cannot be fetched
Traceback (most recent call last):
  File "./scripts/fetch_eggs.py", line 30, in <module>
    c.resolve() # Only fetch eggs required by the config
  File "/Users/myuserid/galaxy-dist/lib/galaxy/eggs/__init__.py", line 345, in resolve
    egg.resolve()
  File "/Users/myuserid/galaxy-dist/lib/galaxy/eggs/__init__.py", line 195, in resolve
    return self.version_conflict( e.args[0], e.args[1] )
  File "/Users/myuserid/galaxy-dist/lib/galaxy/eggs/__init__.py", line 226, in version_conflict
    r = pkg_resources.working_set.resolve( ( dist.as_requirement(), ), env, egg.fetch )
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 565, in resolve
    raise DistributionNotFound(req)  # XXX put more info here
pkg_resources.DistributionNotFound: numpy==1.6.0
Fetch failed.
[~/galaxy-dist] myuserid 10:12 AM 





Re: [galaxy-dev] rpy - No module named rpy CentoOs install

2011-09-02 Thread Nate Coraor
Joseph Hargitai wrote:
 additional info:
 
 it is possible on the same node to run manually
 
 ./gsummary.py
 
 with the header:
 
 #!/usr/bin/env python
 
 import sys, re, tempfile
 from rpy_options import set_options
 set_options(RHOME='/apps1/R/2.13.1/intel/lib64/R')
 from rpy import *
 
 Where else could there be an env setting that prevents this app from finding 
 the module when it runs from within Galaxy?

Hi Joe,

If you set RHOME in the environment and then run gsummary.py without the
additions, does it work?
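
One way to run that kind of test without editing any script is to set RHOME in a copy of the environment and try the import in a child interpreter. This is a rough sketch written for a current Python; the helper name import_works is made up.

```python
import os
import subprocess
import sys


def import_works(module, extra_env=None, python=sys.executable):
    """Return True if `python -c "import <module>"` succeeds.

    extra_env (for example {"RHOME": "..."}) is added to a copy of the
    current environment, so the calling process is left untouched.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    with open(os.devnull, "w") as devnull:
        status = subprocess.call(
            [python, "-c", "import %s" % module],
            env=env, stdout=devnull, stderr=devnull,
        )
    return status == 0
```

Calling import_works("rpy", {"RHOME": "/apps1/R/2.13.1/intel/lib64/R"}) with the RHOME value from Joe's header, and again without the second argument, would show whether the environment variable alone makes the difference.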

--nate

 
 j
 
 
 
 From: Joseph Hargitai
 Sent: Thursday, September 01, 2011 12:28 PM
 To: galaxy-dev@lists.bx.psu.edu
 Subject: rpy - No module named rpy CentoOs install
 
 Hi,
 
 On our Ubuntu install, the stat packages and everything else that requires rpy work fine.
 
 On our CentOS install we are seeing this stubborn error, which from previous 
 posts appears to be difficult to fix.
 
 At first I suspected the SGE issue (environment not transferring to the compute 
 nodes), but after changing the app to run locally I had the same issue.
 
 CentOs: 2.6.18-92.1.13.el5
 
 rpy module is in:
 
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages
 
 _rpy2122.so
 _rpy2131.so
 
 version:
 [galaxy@compute-0-65 galaxy-dist]$ python -c "import rpy; print rpy.__version__"
 1.5.1
 
 path:
 
 python -c 'import sys; print "\n".join( sys.path )'
 
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/simplejson-2.0.9-py2.6-linux-x86_64.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Sphinx-1.0.7-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/docutils-0.7-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Jinja2-2.5.5-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Pygments-1.4-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/nose-1.0.0-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Traits-3.5.0-py2.6-linux-x86_64.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/nibabel-1.0.0-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/nipype-0.0.0-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/setuptools-0.6c12dev_r88846-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/birdsuite-1.0-py2.5.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/mpgutils-0.7-py2.5.egg
 /apps1/python/2.6.6/intel/lib/python26.zip
 /apps1/python/2.6.6/intel/lib/python2.6
 /apps1/python/2.6.6/intel/lib/python2.6/plat-linux2
 /apps1/python/2.6.6/intel/lib/python2.6/lib-tk
 /apps1/python/2.6.6/intel/lib/python2.6/lib-old
 /apps1/python/2.6.6/intel/lib/python2.6/lib-dynload
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/PIL
 
 
 compiled against /R/2.13.1
 
 
 env:
 
 export PATH=.:\
 /apps1/R/2.13.1/intel/bin:\
 /apps1/python/2.6.6/intel/bin:\
 /apps1/pipe/bowtie/0.12.7/intel:\
 /apps1/pipe/bwa/0.5.9/intel:\
 /apps1/samtools/0.1.13/intel/bin:\
 /apps1/fastx_toolkit/0.0.13/intel/bin:\
 /apps1/maq/maq-0.7.1:\
 /apps1/maq/maq-0.7.1/scripts:\
 /apps1/bfast/bfast-0.6.5a/butil:\
 /apps1/bfast/bfast-0.6.5a/scripts:\
 /apps1/abyss/1.2.7/intel/bin:\
 /apps1/velvet/velvet_1.0.12:\
 /apps1/pipe/tophat/1.3.0/intel/bin:\
 /apps1/pipe/cufflinks/1.0.3/intel/bin:\
 /apps1/blast/2.2.25/gnu/bin:\
 /apps1/blast+/2.2.5/gnu/bin:\
 /apps1/sputnik/intel/bin:\
 /apps1/taxonomy/intel/bin:\
 /apps1/add_scores/add_scores:\
 /apps1/emboss/6.4.0/intel/bin:\
 /apps1/hyphy/hyphy/HYPHY:\
 /apps1/lastz/1.02.00:\
 /apps1/perm/0.3.6/intel/bin:\
 /apps1/beam2/intel/bin:\
 /apps1/pass2/intel/bin:\
 /apps1/plink/1.07/intel/bin:\
 /apps1/fbat/2.0.3/bin:\
 /apps1/eigensoft/3.0/intel/bin:\
 /apps1/mosaik/Mosaik-1.1.0021-Linux-x64/bin:\
 /apps1/freebayes/freebayes.git/bin:\
 $PATH
 
 export LD_LIBRARY_PATH=.:\
 /apps1/python/2.6.6/intel/lib:\
 /apps1/libgtextutils/0.6/intel/lib:\
 /apps1/emboss/6.4.0/intel/lib:\
 /apps1/intel/lib/intel64:\
 /apps1/intel/mkl/lib/em64t:\
 /apps1/tcltk/8.5.9/intel/lib:\
 /apps1/zlib/1.2.5/intel/lib:\
 /apps1/graphviz/2.26.3/intel/lib:\
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/simtk/chem/openmm/OpenMM:\
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages:\
 /apps1/libpng/1.5.0/intel/lib:\
 /apps1/R/2.13.1/intel/lib64/R/lib:\
 $LD_LIBRARY_PATH
 
 export PKG_CONFIG_PATH=.:\
 /apps1/R/2.13.1/intel/lib64/pkgconfig:\
 /apps1/libgtextutils/0.6/intel/lib/pkgconfig:\
 /apps1/sparsehash/1.11/intel/lib/pkgconfig:\
 $PKG_CONFIG_PATH
 
 export CLASSPATH=.:\
 /apps1/gatk/gatk-git/dist:\
 /apps1/gatk/gatk-git/lib:\
 /apps1/srma/srma-0.1.13:\
 /apps1/haploview/4.2:\
 /apps1/picard/picard-tools-1.50:\
 /apps1/fastqc/fastqc-0.9.5:\
 $CLASSPATH
 
 
 best,
 joe
 
 



Re: [galaxy-dev] Modifying OpenID providers

2011-09-02 Thread Nate Coraor
Hi Nikolai,

It's best to send questions directly to the mailing list, so they reach
the widest audience and the right people.

There are some responses below:

Nikolai Vazov wrote:
 
 Hi, Nate,
 
 You helped me install Galaxy with a DB hosted on a remote server via
 SSL connection. Thanks a lot again for your help.
 
 I have been struggling to add OpenID providers to Galaxy, but have
 not been very successful. I have some questions:
 
 1) Where are the variables (OpenID providers) in the dropdown menu
 stored? In a file, or in the DB? The template (login.mako) has the line
 
 
 <%def name="render_openid_form( referer, auto_associate, openid_providers )">
 
 but where does the data for openid_providers come from? (I am a newbie in
 Python ...)
 
 2) In which files do you configure them?

The list is hardcoded in:

lib/galaxy/web/controllers/user.py

These should ultimately be moved to a configuration file.
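
Moving the list into a configuration file could look roughly like the following. Everything here is hypothetical: the [openid_providers] section name, the file layout, and the load_openid_providers function do not exist in Galaxy, the provider URLs are placeholders, and the code is written for Python 3 (where the module is named configparser).

```python
from configparser import ConfigParser

# Placeholder content; a real file would list the providers a site trusts.
EXAMPLE_CONFIG = """
[openid_providers]
Example Provider = https://openid.example.org/
Another Provider = https://login.example.net/openid
"""


def load_openid_providers(text):
    """Parse provider name -> URL pairs from INI-style text."""
    parser = ConfigParser()
    parser.optionxform = str  # keep the case of provider names
    parser.read_string(text)
    if not parser.has_section("openid_providers"):
        return {}
    return dict(parser.items("openid_providers"))
```

The resulting dictionary could then be passed to the template in place of the hardcoded list, which would let a site add or remove providers without touching user.py.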

 3) The provider I want to add uses SAML 2.0. Which Python package do I
 need for that? pysaml2.0?

I haven't worked with SAML before so unfortunately I can't be of much
help here.

--nate

 
 Thank you for your help
 
 Nikolai
 
 -- 
 Nikolay Vazov, PhD
 Research Computing Centre - http://hpc.uio.no
 USIT, University of Oslo


Re: [galaxy-dev] link file bug in the new version

2011-09-02 Thread Nate Coraor
remy d1 wrote:
 Hi,
 
 We found a little problem in the new galaxy release. When we upload a
 dataset library from filesystem :
 
 Admin > Manage data libraries > create new data library > Add dataset >
 Upload files from filesystem path > Link to files without copying
 
 If galaxy user is the owner of this file or if he has write permission on it
 (on the filesystem), the file is deleted !!
 
 
 I do not think it is the normal behaviour...

Hi Remy,

I'm unable to duplicate this behavior.  Would it be possible for you to
do some debugging on your end to determine when this is happening?  If
it's Galaxy's code it would probably be somewhere in
tools/data_source/upload.py

--nate

 
 
 Regards.



Re: [galaxy-dev] Galaxy egg fetching error? Mac OS X 10.7 (Lion)

2011-09-02 Thread Nate Coraor
Whyte, Jeffrey wrote:
 Hi,
 
 I've been having trouble running a local instance of Galaxy on a Mac Pro 
 after upgrading to OS X 10.7 (Lion).  My Python version is 2.7.1 and 
 Mercurial is 1.9.1 for MacOS X 10.7.  I don't have any MacPorts installed.
 
 The error I see after running the startup script is pasted at the end of this 
 message.  Thanks in advance for any help or advice.
 jjw

Hi Jeffrey,

We haven't yet gotten our dependencies up to speed on Lion.  Could you
grab a copy of Python from python.org and use this?  It'll install under
/Library/Frameworks/Python.framework, just add the correct bin/
directory to the front of your $PATH and start Galaxy as normal.
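
In shell terms this is a single PATH export before ./run.sh. The same logic as a Python sketch, in case the launch is scripted; prepend_to_path is a made-up helper, and the Versions/2.7/bin directory mentioned in the note below is an assumption about where the python.org installer puts its binaries.

```python
import os


def prepend_to_path(directory, path_value=None):
    """Return a PATH string with directory first and no duplicate of it."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    parts = [p for p in path_value.split(os.pathsep) if p and p != directory]
    return os.pathsep.join([directory] + parts)
```

A launcher could copy os.environ, set its PATH entry to prepend_to_path("/Library/Frameworks/Python.framework/Versions/2.7/bin"), and start run.sh with that environment, leaving the user's own shell settings unchanged.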

Sorry for the inconvenience,
--nate

 


Re: [galaxy-dev] Galaxy egg fetching error? Mac OS X 10.7 (Lion)

2011-09-02 Thread Trevor Wennblom

On Sep 2, 2011, at 1:16 PM, Nate Coraor wrote:

 We haven't yet gotten our dependencies up to speed on Lion.  Could you
 grab a copy of Python from python.org and use this?  It'll install under
 /Library/Frameworks/Python.framework, just add the correct bin/
 directory to the front of your $PATH and start Galaxy as normal.

i'm assuming this is similar? 
https://bitbucket.org/galaxy/galaxy-central/issue/616/bad-eggs-for-106


[galaxy-dev] handling galaxy updates

2011-09-02 Thread Shantanu Pavgi

Hi,

I am curious to know which revision of the Galaxy code gets deployed on the main ( 
http://main.g2.bx.psu.edu/ ) and test ( http://test.g2.bx.psu.edu/ ) instances 
of Galaxy. I assumed that code from the active development repository, galaxy-central, 
gets deployed on the test instance and that the stable galaxy-dist code gets 
deployed on the main instance. However, it seems that the main 
instance is updated more frequently than the galaxy-dist repository. Should local 
Galaxy instances keep up with the revisions on PSU's main instance, or wait for 
stable code to be released in the galaxy-dist repository? We have been keeping up 
with the galaxy-dist repository and not with galaxy-central updates. Are 
other sites following a similar update model? 

--
Thanks,
Shantanu. 


Re: [galaxy-dev] rpy - No module named rpy CentoOs install

2011-09-02 Thread Joseph Hargitai

Nate, 

could we go to the beginning of the issue:

where is the galaxy env set?  I've seen a few posts but I can only gather 
partial info.

- it is NOT set from the galaxy user .bashrc or .profile
- if it is indeed partially set from /etc/profile - using a Rocks cluster 
leaves you with many entries there to ponder
- if it is using ld.so.conf.d as well - it will read /usr/lib64 entries etc... 
- is there a precise way to see what env is used for galaxy? The log script 
gives you a nice read on the python path but is there a way to see all env vars? 
Looking at the env as the user galaxy does not match what galaxy ends up using.
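
One possible way to answer the last question, as a minimal sketch: on Linux the kernel exposes the environment a process was started with under /proc/<pid>/environ, as NUL-separated KEY=VALUE entries. The function names below are made up for illustration and are not part of Galaxy. Note that this shows the environment at process start; later changes made inside the process may not appear there.

```python
def parse_environ(raw):
    """Turn the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue  # skip the empty chunk after the final NUL
        key, sep, value = entry.partition(b"=")
        if sep:  # ignore malformed entries without '='
            env[key.decode("utf-8", "replace")] = value.decode("utf-8", "replace")
    return env


def read_process_env(pid):
    """Read the start-up environment of a running process (Linux only)."""
    with open("/proc/%d/environ" % pid, "rb") as handle:
        return parse_environ(handle.read())
```

Calling read_process_env() with the pid of the running Galaxy server (as the galaxy user or root) and printing PATH, PYTHONPATH and RHOME from the result should show what the server actually sees, rather than what a login shell would set up.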

multiple issues on the CentOS install:

I traced the missing rpy module to a setting (or missing setting) by 
looking at the runner log script: 
while it was loading python2.6.6 it was also loading the site-packages and 
other python parts from /usr/lib64...python2.4 

Once I edited run.sh to use the correct python and correct R path and added the 
RHOME to the rpy dependent scripts, this problem went away, seemingly only to 
produce what looks like an env issue: sh rm command not found when running rpy 
dependent applications. Did the edit somehow destroy the /bin and /usr/bin path? 
Would these be set in run.sh as well?   

To your question: 
where do you set RHOME in the env? 

We'd prefer to set all path options in run.sh, if all of the above is true and 
you cannot set them in ~/.bash* 
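
For what it is worth, here is a rough sketch of the idea in Python rather than in run.sh itself: build the child environment explicitly, put the site-specific bin directories first, and make sure /bin and /usr/bin are still on PATH so that basic commands such as rm are found. build_tool_env is a hypothetical helper, not something Galaxy provides, and the paths in the usage note are simply the ones quoted elsewhere in this thread.

```python
import os


def build_tool_env(base_env, rhome, extra_bin_dirs):
    """Return a copy of base_env with RHOME set and PATH extended safely."""
    env = dict(base_env)
    env["RHOME"] = rhome

    # Start from whatever PATH we inherited, dropping empty entries.
    existing = [p for p in env.get("PATH", "").split(os.pathsep) if p]

    # Keep the system directories so commands like rm are still found.
    for system_dir in ("/bin", "/usr/bin"):
        if system_dir not in existing:
            existing.append(system_dir)

    # Site-specific directories go first; duplicates are dropped.
    merged = []
    for entry in list(extra_bin_dirs) + existing:
        if entry not in merged:
            merged.append(entry)

    env["PATH"] = os.pathsep.join(merged)
    return env
```

Something like build_tool_env(os.environ, '/apps1/R/2.13.1/intel/lib64/R', ['/apps1/R/2.13.1/intel/bin', '/apps1/python/2.6.6/intel/bin']) could then be passed as the env argument of subprocess calls. The same ordering rule (prepend, never replace) applies if the settings live in run.sh instead.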

best,
joe

From: Nate Coraor [n...@bx.psu.edu]
Sent: Friday, September 02, 2011 1:40 PM
To: Joseph Hargitai
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] rpy - No module named rpy CentoOs install

Joseph Hargitai wrote:
 additional info:

 it is possible on the same node to run manually

 ./gsummary.py

 with the header:

 #!/usr/bin/env python

 import sys, re, tempfile
 from rpy_options import set_options
 set_options(RHOME='/apps1/R/2.13.1/intel/lib64/R')
 from rpy import *

 Where else could there be an env setting that prevents this app from finding the 
 module from within galaxy?

Hi Joe,

If you set RHOME in the environment and then run gsummary.py without the
additions, does it work?

--nate


 j


 
 From: Joseph Hargitai
 Sent: Thursday, September 01, 2011 12:28 PM
 To: galaxy-dev@lists.bx.psu.edu
 Subject: rpy - No module named rpy CentoOs install

 Hi,

 On our Ubuntu install stat packages and all that require rpy work fine.

 On our CentOs install we are seeing this stubborn error, which previous 
 posts suggest is difficult to fix.

 At first I suspected the SGE issue - environment not transferring to compute 
 nodes. After changing the app to run locally we had the same issue.

 CentOs: 2.6.18-92.1.13.el5

 rpy module is in:

 /apps1/python/2.6.6/intel/lib/python2.6/site-packages

 _rpy2122.so
 _rpy2131.so

 version:
 [galaxy@compute-0-65 galaxy-dist]$ python -c "import rpy; print rpy.__version__"
 1.5.1

 path:

 python -c 'import sys; print "\n".join( sys.path )'

 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/simplejson-2.0.9-py2.6-linux-x86_64.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Sphinx-1.0.7-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/docutils-0.7-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Jinja2-2.5.5-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Pygments-1.4-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/nose-1.0.0-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/Traits-3.5.0-py2.6-linux-x86_64.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/nibabel-1.0.0-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/nipype-0.0.0-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/setuptools-0.6c12dev_r88846-py2.6.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/birdsuite-1.0-py2.5.egg
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/mpgutils-0.7-py2.5.egg
 /apps1/python/2.6.6/intel/lib/python26.zip
 /apps1/python/2.6.6/intel/lib/python2.6
 /apps1/python/2.6.6/intel/lib/python2.6/plat-linux2
 /apps1/python/2.6.6/intel/lib/python2.6/lib-tk
 /apps1/python/2.6.6/intel/lib/python2.6/lib-old
 /apps1/python/2.6.6/intel/lib/python2.6/lib-dynload
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages
 /apps1/python/2.6.6/intel/lib/python2.6/site-packages/PIL


 compiled against /R/2.13.1


 env:

 export PATH=.:\
 /apps1/R/2.13.1/intel/bin:\
 /apps1/python/2.6.6/intel/bin:\
 /apps1/pipe/bowtie/0.12.7/intel:\
 /apps1/pipe/bwa/0.5.9/intel:\
 /apps1/samtools/0.1.13/intel/bin:\
 /apps1/fastx_toolkit/0.0.13/intel/bin:\
 /apps1/maq/maq-0.7.1:\
 /apps1/maq/maq-0.7.1/scripts:\
 /apps1/bfast/bfast-0.6.5a/butil:\
 /apps1/bfast/bfast-0.6.5a/scripts:\
 /apps1/abyss/1.2.7/intel/bin:\
 /apps1/velvet/velvet_1.0.12:\
 /apps1/pipe/tophat/1.3.0/intel/bin:\
 /apps1/pipe/cufflinks/1.0.3/intel/bin:\
 /apps1/blast/2.2.25/gnu/bin:\
 /apps1/blast+/2.2.5/gnu/bin:\
 

Re: [galaxy-dev] disk space and file formats

2011-09-02 Thread Edward Kirton
 What, like a BAM file of unaligned reads? Uses gzip compression, and
 tracks the pairing information explicitly :) Some tools will already take
 this as an input format, but not all.

ah, yes, precisely.  i actually think illumina's pipeline produces
files in this format now.
wrappers which create a temporary fastq file would need to be created
but that's easy enough.


Re: [galaxy-dev] handling galaxy updates

2011-09-02 Thread Shantanu Pavgi

Thanks for the reply Kanwei.  Is there any announcement or news feed that 
people can subscribe to in order to know when the main galaxy instance is 
updated? This would help sites which want to keep up with PSU's main galaxy 
instance. Do you announce it on the galaxy-user list? 

--
Shantanu. 


On Sep 2, 2011, at 1:57 PM, Kanwei Li wrote:

 Hi Shantanu,
 
 Test usually tracks galaxy-central pretty closely, and we do update main more 
 often than galaxy-dist (you can see the version on main at the index page). 
 If nothing breaks on main for a while we do a galaxy-dist release so we are 
 generally confident that it will be stable.
 
 Thanks,
 
 K
 
 On Fri, Sep 2, 2011 at 2:22 PM, Shantanu Pavgi pa...@uab.edu wrote:
 
 Hi,
 
 I am curious to know which revision of galaxy code gets deployed in the main 
 ( http://main.g2.bx.psu.edu/ ) and test ( http://test.g2.bx.psu.edu/  ) 
 instances of galaxy. I was thinking active development repository 
 galaxy-central code gets deployed in the test galaxy instance and stable 
 galaxy-dist code gets deployed in the main galaxy instance.  However, it 
 seems like main galaxy instance is updated more frequently than galaxy-dist 
 repository. Should local galaxy instances keep up with PSU's main galaxy 
 instance revisions or wait for stable code to be released in galaxy-dist 
 repository? We have been keeping up with galaxy-dist repository and not the 
 galaxy-central repository updates. Are  other sites following similar update 
 model?
 
 --
 Thanks,
 Shantanu.
 




Re: [galaxy-dev] disk space and file formats

2011-09-02 Thread Fields, Christopher J
On Sep 2, 2011, at 3:02 PM, Edward Kirton wrote:

 What, like a BAM file of unaligned reads? Uses gzip compression, and
 tracks the pairing information explicitly :) Some tools will already take
 this as an input format, but not all.
 
 ah, yes, precisely.  i actually think illumina's pipeline produces
 files in this format now.
 wrappers which create a temporary fastq file would need to be created
 but that's easy enough.

My argument against that is the cost of going from BAM -> temp fastq may be 
prohibitive, e.g. the need to generate very large temp fastq files on the fly 
as input for various applications may lead one back to just keeping a permanent 
FASTQ around anyway.  One could probably get better performance out of a 
simpler format that removes most of the 'AM' parts of BAM.  Or is the idea that 
the file itself is modified, like a database?  And how would indexing work (BAM 
uses binning on the match to the reference seq), or does it matter?

I recall hdf5 was planned as an alternate format (PacBio uses it, IIRC), and of 
course there is NCBI's .sra format.  Anyone using the latter two? 

chris




Re: [galaxy-dev] Suggestion / Request for Comments on Galaxy Best Practices - Gradual migration to standard indentation

2011-09-02 Thread Dave Clements
Hello all,

I've created a wiki page on coding best practices to record what is actually
done, and the results of discussions like these:
   http://wiki.g2.bx.psu.edu/Develop/Best Practices

So far, it only lists 2 standards:

1. 4 spaces per indent level
2. Use spaces, not tabs.

I'll continue to watch this list and add best practices accordingly.

Dave C.

-- 
http://galaxyproject.org/
http://getgalaxy.org/
http://usegalaxy.org/
http://galaxyproject.org/wiki/
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

[galaxy-dev] downloading bowtie indexed files

2011-09-02 Thread Nikhil Joshi
Hi all,

Just wondering if there is a way to download the bowtie indexed files after
indexing.  It seems that the indexed output is simply a meta-file that
points to the directory where the indexed files are kept but what if I
want to download all of the indexed files themselves?  I wrote something to
do this... basically I created an html page that is the output of
bowtie-build which then points to the created files... but I'm wondering if
there is an easier way...?
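
For reference, the workaround described above could look roughly like the following sketch. It is not the actual tool: render_index is a made-up name, and it only assumes that the index files sit together in one directory that the web server can serve.

```python
import html
import os
import urllib.parse


def render_index(directory, title="bowtie index files"):
    """Build a small HTML page linking to every regular file in directory."""
    names = sorted(
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )
    items = "\n".join(
        '<li><a href="%s">%s</a></li>' % (
            html.escape(urllib.parse.quote(name), quote=True),  # safe link target
            html.escape(name),                                  # safe link text
        )
        for name in names
    )
    return (
        "<html><head><title>%s</title></head><body><ul>\n%s\n</ul></body></html>"
        % (html.escape(title), items)
    )
```

The returned string would be written out as the tool's HTML output, with the links resolving relative to the directory holding the index files.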

- Nik.

Re: [galaxy-dev] disk space and file formats

2011-09-02 Thread Edward Kirton
 i actually think illumina's pipeline produces files in this format
 (unaligned-bam) now.

 Oh do they? - that's interesting. Do you have a reference/link?

i caught wind of this at the recent illumina user's conference but i
asked someone in our sequencing team to confirm and he hadn't heard of
this.  it must be limited to the forthcoming miseq sequencer for the
time being, but may make its way to the big sequencers later.
apparently illumina is thinking about storage as well.  i seem to
recall the speaker saying they won't produce srf files anymore, but
again, this was a talk about the miseq so may not apply to the other
sequencers.

 wrappers which create a temporary fastq file would need to be created
 but that's easy enough.

 My argument against that is the cost of going from BAM -> temp
 fastq may be prohibitive, e.g. the need to generate very large
 temp fastq files on the fly as input for various applications may
 lead one back to just keeping a permanent FASTQ around anyway.

 True - if you can't update the tools you need to take BAM.
 In some cases at least you can pipe the gzipped FASTQ
 into alignment tools which accepts FASTQ on stdin, so
 there is no temp file per se.

the tools really do need to support the format; the tmpfile was simply
a workaround.  some tools already support bam, more currently support
fastq.gz.  (someone here made the wrong bet years ago and had adopted
a site-wide fastq.bz2 standard which only recently changed to
fastq.gz.)  but if illumina does start producing bam files in the
future, then we can expect more tools to support that format.  until
they do, probably fastq.gz is a safe bet.

of course there is a computational cost to compressing/uncompressing
files but that's probably better than storing unnecessarily huge
files.  it's a trade-off.
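
to make the "no temp file" point concrete, here is a minimal sketch of reading a fastq.gz record by record, decompressing on the fly.  it assumes plain 4-line fastq records (no wrapped sequence lines), and iter_fastq_gz is just an illustrative name, not an existing galaxy function.

```python
import gzip
import itertools


def iter_fastq_gz(path):
    """Yield (name, sequence, quality) tuples from a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        while True:
            # Pull the next four lines; an empty list means end of file.
            lines = list(itertools.islice(handle, 4))
            if not lines:
                break
            if len(lines) < 4:
                raise ValueError("truncated FASTQ record")
            header, seq, plus, qual = (line.rstrip("\n") for line in lines)
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError("malformed FASTQ record: %r" % header)
            yield header[1:], seq, qual
```

only one record is held in memory at a time, so the uncompressed data never touches the disk.  a wrapper could feed these records to a tool's stdin in the same way.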

similarly, there's a trade-off involved in limiting read qc tools to a
single/few big tools which wrap several tools, with many options.
users can't play around with read qc but that may be too expensive
(computationally and storage-wise).  for the most part, a standard qc
will do.  one can spend a lot of time and effort to squeeze a bit more
useful data out of a bad library, for example, when they probably
should have just sequenced another library.  i favor leaving the
playing around to the r&d/development/qc team and just offering a
canned/vetted qc solution to the average user.

 I recall hdf5 was planned as an alternate format (PacBio uses
 it, IIRC), and of course there is NCBI's .sra format.  Anyone
 using the latter two?
 Moving from the custom BGZF modified gzip format used in
 BAM to HDF5 has been proposed on the samtools mailing list
 (as Chris knows), and there is a proof of principle implementation
 too in BioHDF, http://www.hdfgroup.org/projects/biohdf/
 The SAM/BAM group didn't seem overly enthusiastic though.
 For the NCBI's .sra format, there is no open specification, just
 their public domain source code:
 http://seqanswers.com/forums/showthread.php?t=12054

i believe hdf5 is an indexed data structure which, as you mentioned,
isn't required for unprocessed reads.

since i'm rapidly running out of storage, i think the best immediate
solution for me is to deprecate all the fastq datatypes in favor of a
new fastqsangergz and to bundle the read qc tools to eliminate
intermediate files.  sure, users won't be able to play around with
their data as much, but my disk is 88% full and my cluster has been
100% occupied for 2-months straight, so less choice is probably
better.



Re: [galaxy-dev] Problems with load_workflow_editor

2011-09-02 Thread shashi shekhar
No, it is happening only on my instance of galaxy. i am using an older version
of galaxy. could it be that some browsers don't support the old galaxy version?



On Sat, Sep 3, 2011 at 12:30 AM, Kanwei Li kan...@gmail.com wrote:

 Hi Shashi,

 Does this only happen on your instance or does it happen on our public
 instance as well?

 Thanks,

 K

 On Fri, Sep 2, 2011 at 6:46 AM, shashi shekhar meshash...@gmail.com wrote:

 Hi All,

 i am not able to create a workflow in my local instance of galaxy. it only
 displays the loading picture in the browser when i click on the edit option
 of the workflow. how can we resolve this type of problem? i am using an old
 version of galaxy.



 192.168.60.115, 145.139.1.156 - - [02/Sep/2011:16:13:50 +0600] GET
 /workflow/get_datatypes?_=1314940546255 HTTP/1.1 200 - 
 http://garu.ac.in/workflow/editor?id=df7a1f0c02a5b08e; Mozilla/5.0
 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110420
 Firefox/3.6.17
 192.168.60.115, 145.139.1.156 - - [02/Sep/2011:16:13:51 +0600] GET
 /workflow/load_workflow?_=1314940546462id=df7a1f0c02a5b08e_=true HTTP/1.1
 200 - http://garu.ac.in/workflow/editor?id=df7a1f0c02a5b08e;
 Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110420
 Firefox/3.6.17



 Regards
 shashi shekhar





Re: [galaxy-dev] disk space and file formats

2011-09-02 Thread Peter Cock
On Saturday, September 3, 2011, Edward Kirton eskir...@lbl.gov wrote:
 of course there is a computational cost to compressing/uncompressing
 files but that's probably better than storing unnecessarily huge
 files.  it's a trade-off.

It may still be faster due to less IO, probably depends on your hardware.

 since i'm rapidly running out of storage, i think the best immediate
 solution for me is to deprecate all the fastq datatypes in favor of a
 new fastqsangergz and to bundle the read qc tools to eliminate
 intermediate files.  sure, users won't be able to play around with
 their data as much, but my disk is 88% full and my cluster has been
 100% occupied for 2-months straight, so less choice is probably
 better.

In your position I agree that is a pragmatic choice. You might be able to
modify the file upload code to gzip any FASTQ files... that would prevent
uncompressed FASTQ getting into new histories.
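
As a rough illustration of that idea (not Galaxy's actual upload code, and the function names are invented here), a check-then-compress step could look like this. It detects existing gzip files by their two magic bytes, so already compressed uploads are left alone:

```python
import gzip
import os
import shutil

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def ensure_gzipped(path):
    """Compress path to path + '.gz' and remove the original.

    Files that are already gzip are left untouched. Returns the final path.
    """
    if is_gzipped(path):
        return path
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams in chunks, no full read
    os.remove(path)
    return gz_path
```

A real hook would also need to update the dataset's recorded file name and datatype, which this sketch does not attempt.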

I wonder if Galaxy would benefit from a new fastqsanger-gzip (etc) datatype?
However this seems generally useful (not just for FASTQ) so perhaps a more
general mechanism would be better where tool XML files can say which file
types they accept and which of those can/must be compressed (possibly not
just gzip format?).

Peter

Re: [galaxy-dev] disk space and file formats

2011-09-02 Thread Fields, Christopher J
On Sep 2, 2011, at 8:02 PM, Peter Cock wrote:

 On Fri, Sep 2, 2011 at 9:27 PM, Fields, Christopher J
 cjfie...@illinois.edu wrote:
 On Sep 2, 2011, at 3:02 PM, Edward Kirton wrote:
 
 What, like a BAM file of unaligned reads? Uses gzip compression, and
 tracks the pairing information explicitly :) Some tools will already take
 this as an input format, but not all.
 
 ah, yes, precisely.  i actually think illumina's pipeline produces
 files in this format now.
 
 Oh do they? - that's interesting. Do you have a reference/link?
 
 wrappers which create a temporary fastq file would need to be created
 but that's easy enough.
 
 My argument against that is the cost of going from BAM -> temp
 fastq may be prohibitive, e.g. the need to generate very large
 temp fastq files on the fly as input for various applications may
 lead one back to just keeping a permanent FASTQ around anyway.
 
 True - if you can't update the tools you need to take BAM.
 In some cases at least you can pipe the gzipped FASTQ
 into alignment tools which accepts FASTQ on stdin, so
 there is no temp file per se.

Some applications (Velvet for instance) accept gzipped FASTQ, though they may 
turn around and dump the data out uncompressed.

  One could probably get better performance out of a simpler
 format that removes most of the 'AM' parts of BAM.
 
 Yes, but that means inventing yet another file format. At least
 gzipped FASTQ is quite straightforward.

Yes.

 Or is the idea that the file itself is modified, like a database?
 
 That would be quite a dramatic change from the current
 Galaxy workflow system - I doubt that would be acceptable
 in general.

My thought as well.

 And how would indexing work (BAM uses binning on the
 match to the reference seq), or does it matter?
 
 BAM indexing as done in samtools/picard is only for the aligned
 reads - so no help for a BAM file of unaligned reads. You could
 use a different indexing system (e.g. by read name) and the
 same BAM BGZF block offset system (I've tried this as an
 experiment with Biopython's SQLite indexing of sequence files).
 
 However, for tasks taking unaligned reads as input, you
 generally just iterate over the reads in the order on disk.

I think, unless there is a demonstrable advantage to using unaligned BAM, 
fastq.gz is the easiest.
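
To make the read-name indexing idea quoted above a bit more concrete, here is a simplified sketch. It is not Biopython's implementation: it indexes an uncompressed 4-line FASTQ by plain byte offset rather than by BGZF block offset, and the table and function names are invented for illustration.

```python
import sqlite3


def build_index(fastq_path, db_path):
    """Store the byte offset of each 4-line FASTQ record, keyed by read name."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reads "
        "(name TEXT PRIMARY KEY, byte_offset INTEGER)"
    )
    with open(fastq_path, "rb") as handle:
        while True:
            offset = handle.tell()       # where this record starts
            header = handle.readline()
            if not header:
                break                    # end of file
            if not header.strip():
                continue                 # tolerate stray blank lines
            name = header[1:].split()[0].decode("ascii")
            conn.execute(
                "INSERT OR REPLACE INTO reads VALUES (?, ?)", (name, offset)
            )
            for _ in range(3):           # skip sequence, '+', quality
                handle.readline()
    conn.commit()
    return conn


def fetch_record(fastq_path, conn, name):
    """Seek straight to one read and return its four lines."""
    row = conn.execute(
        "SELECT byte_offset FROM reads WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    with open(fastq_path, "rb") as handle:
        handle.seek(row[0])
        return [handle.readline().decode("ascii").rstrip("\n") for _ in range(4)]
```

For a BGZF-compressed file the stored offset would have to be a virtual offset (block start plus position within the block), which is the part this sketch leaves out.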

 I recall hdf5 was planned as an alternate format (PacBio uses
 it, IIRC), and of course there is NCBI's .sra format.  Anyone
 using the latter two?
 
 Moving from the custom BGZF modified gzip format used in
 BAM to HDF5 has been proposed on the samtools mailing list
 (as Chris knows), and there is a proof of principle implementation
 too in BioHDF, http://www.hdfgroup.org/projects/biohdf/
 The SAM/BAM group didn't seem overly enthusiastic though.

Probably not, as it is somewhat a competitor of SAM/BAM (a bit broader in 
scope, beyond just alignments).  As Peter indicated, I know the BioHDF folks 
(they are here in town); however, my actual question was whether anyone is 
actually using HDF5 or SRA in production?  I haven't seen adoption beyond 
PacBio, but I have seen some things popping up in Galaxy.

 For the NCBI's .sra format, there is no open specification, just
 their public domain source code:
 http://seqanswers.com/forums/showthread.php?t=12054
 
 Regards,
 
 Peter

Simply gzipping FASTQ seems to give better compression than a .lite.sra file 
(and I'm not a happy user of their SRA toolset).  And of course there is 
parallel gzip...

chris

