[galaxy-dev] file upload/unzip issues

2012-12-08 Thread Michael Moore
Hello,

I and other members of my lab are encountering issues uploading (some) files to 
the Main public Galaxy server.  We routinely upload from our local server to 
main.g2.bx.psu.edu via FTP using Cyberduck.  In my case, on 12/6/12 I uploaded
12 gzipped FASTQ files.  Of these, 9 were completely fine.  For the remaining 3
files, the transfer through Cyberduck appeared to work, and the files
appeared (with the correct file size) on the upload screen under "Get data" as
usual.  However, once the files were loaded into a Galaxy history, they were
empty, with a message saying "Problem decompressing gzipped data".  An example
is entry #45: '1PositiveRFP92112Pool41_ATCACG_L002_R1_001.fastq.gz' in my
history called "Tcell_120812".  My account uses this email address as the login
ID (mmo...@rockefeller.edu).

All of these files are pretty large (from 5 to 10 GB once unzipped), but
failure or success did not appear to correlate with file size.  I am not
exceeding my space quota.  In addition, I was able to upload the file listed
above (1PositiveRFP92112Pool41_ATCACG_L002_R1_001.fastq.gz) to our local
Galaxy installation, so I don't think there is anything wrong with the file
itself.  However, it has failed to upload to the public Galaxy server
multiple times.  The experience of my lab mates is similar: some of their
files upload correctly, and others have the same problem I described above.
We can't find anything in common among the files that fail to upload.  For
instance, they span different sequencing runs and were deposited on our
local server at different times.
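
One way to rule out a damaged copy on the local side is to stream-decompress each file before uploading and see whether the gzip stream reads cleanly to the end. The following is only a minimal sketch under that assumption; the function name `gzip_is_intact` is made up for illustration and is not part of Galaxy.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1024 * 1024):
    """Return True if the whole gzip stream at `path` decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read in chunks so multi-gigabyte files do not fill memory.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # Bad header, truncated stream, or corrupt compressed data.
        return False
    return True
```

A file that passes this check locally but still fails on the server would point toward the transfer or the server-side decompression rather than the file itself.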

Has anyone else been encountering similar issues?  Please let me know if I
should direct this question elsewhere or if I can provide any further 
information.  Thanks very much for your time.

Michael Moore
Rockefeller University


___
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

[galaxy-dev] job terminating on warnings from software

2012-05-04 Thread Michael Moore
I have a long string of steps dying on the last step because gnuplot sends
a warning to the log, which Galaxy faithfully records 11 times in my
history files.  The 10 other files have downloadable content, and in
fact, outside of Galaxy the plot works.  The warning is simply about
changing some intervals.

How can I tell Galaxy to keep going unless I get an actual error?
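
Assuming the job is being flagged as failed because the warning is written to stderr, one possible workaround is a small wrapper that runs the real command and only passes stderr through when the exit status is non-zero. This is a sketch, not a Galaxy feature; `run_tolerating_warnings` is a hypothetical name.

```python
import subprocess
import sys


def run_tolerating_warnings(command):
    """Run `command`; forward its stderr only if the exit status is non-zero."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    sys.stdout.write(out.decode(errors="replace"))
    if proc.returncode != 0:
        # Genuine failure: keep the messages on stderr so the job is flagged.
        sys.stderr.write(err.decode(errors="replace"))
    else:
        # Warnings only: keep them visible, but on stdout.
        sys.stdout.write(err.decode(errors="replace"))
    return proc.returncode
```

The tool's command line would then call the wrapper with the original command as its argument list, and exit with the returned status.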

Michael

[galaxy-dev] Spooky behavior

2012-05-04 Thread Michael Moore
OK: a brand-new data library.  I upload one BAM file using Galaxy's "Add
Dataset".

I start a new history, titling it "testing multiple use", and import the
dataset to the current history.  I run a job that reads the BAM file and
produces some output.

Everything works.  I have 13 output datasets in the history, all showing
green.

Now I create a second history, called "2nd use same dataset same library".
Again I check the box and click "import to current history".
Then I try to run the same job as before...

[bam_header_read] bgzf_check_EOF: Invalid argument
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from /.../galaxy/database/files/000/dataset_961.dat

(Ellipsis is mine, not Python's)

Oh my: looking in that directory, I discover that my BAM file is at
dataset_962.dat.

What gives?
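
To see which of the two dataset files actually holds BAM data, one can test for the BAM magic bytes directly. BAM files are BGZF-compressed (gzip-compatible), and the decompressed stream begins with the four bytes `BAM\1`. A minimal sketch; `looks_like_bam` is a hypothetical helper, not something Galaxy or samtools provides.

```python
import gzip
import zlib

BAM_MAGIC = b"BAM\x01"


def looks_like_bam(path):
    """Return True if `path` is gzip/BGZF-compressed and starts with the BAM magic."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == BAM_MAGIC
    except (OSError, EOFError, zlib.error):
        # Not gzip at all, truncated, or corrupt.
        return False
```

Running it on both dataset_961.dat and dataset_962.dat would show whether the second history is simply pointing at the wrong file.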

Michael Moore

Re: [galaxy-dev] /bin/sh: samtools: not found--WORKAROUND

2012-04-30 Thread Michael Moore
Yeah, I was using ['printenv| mail myaddress  samtools'] inside the
subprocess.Popen(['samtools'],... in upload.py.

Right before the subprocess was called in upload.py, I had /usr/bin/ in my
PATH, but inside the subprocess the story was different.  I just used the
symbolic link to reach samtools from a directory that remained in the PATH.
I have no idea how this could happen, but I have noticed that Galaxy does
things with input and output and seems to manipulate the environment
heavily.  But why should a child process have a different environment when
none was specified?  I'll figure that out later.  I am still trying to
convert some software to run with Galaxy.
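
For debugging this kind of mismatch, it can help to hand the child an explicit environment instead of relying on whatever it inherits. The sketch below copies the parent environment, prepends extra directories to PATH, and passes the result through the `env` argument of `subprocess.Popen`. The helper name `run_with_path` is hypothetical and this is not how upload.py itself is written.

```python
import os
import subprocess
import sys


def run_with_path(command, extra_dirs):
    """Run `command` with `extra_dirs` prepended to the child's PATH."""
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["PATH"] = os.pathsep.join(list(extra_dirs) + [env.get("PATH", "")])
    proc = subprocess.Popen(
        command, env=env, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    out, err = proc.communicate()
    return proc.returncode, out.decode(errors="replace"), err.decode(errors="replace")
```

Calling it with a command that prints its own PATH, once with and once without the extra directory, shows exactly what the child sees.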

On Mon, Apr 30, 2012 at 8:57 AM, Nate Coraor n...@bx.psu.edu wrote:

 On Apr 24, 2012, at 8:36 PM, Michael Moore wrote:

  There is apparently a persistent problem with samtools which normally
 lives at /usr/bin/samtools.  I encountered a similar problem in Python when
 uploading BAM files.
 
  I did not resolve the problem.  I hacked for a while on binary.py in a
 lib/ subdirectory and used os.system to send myself mail describing the
 effective path at various points, and I added a missing
 
  logging.basicConfig()
 
  statement and scattered some log.WARNING statements strategically.  All
 this told me nothing.  So I made a few symlinks to samtools.  The one that
 got things working was
 
  ln -s /usr/bin/samtools /home/galaxy/bin/samtools
 
  so--worked around but not resolved.

 Hi Michael,

 For tools that output BAM, samtools needs to be in your $PATH, or has to
 be set up via the tool dependencies system.  See the following for details:

http://wiki.g2.bx.psu.edu/Admin/Config/Tool%20Dependencies

 For SGE, you can modify the $PATH used on the cluster in ~/.sge_request or
 the file specified in the 'environment_setup_file' galaxy config option.

 --nate

 
  Michael
 
  On Tue, Apr 17, 2012 at 12:15 PM, zhengqiu cai caizhq2...@yahoo.com.cn
 wrote:
  Hi All,
 
  I submitted a job to convert sam to bam, and the job was running forever
  without outputting the result. I then checked the log, and it read:
  Traceback (most recent call last):
    File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 336, in finish_job
      drm_job_state.job_wrapper.finish( stdout, stderr )
    File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/__init__.py", line 637, in finish
      dataset.set_meta( overwrite = False )
    File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/model/__init__.py", line 875, in set_meta
      return self.datatype.set_meta( self, **kwd )
    File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/datatypes/binary.py", line 179, in set_meta
      raise Exception, "Error Setting BAM Metadata: %s" % stderr
  Exception: Error Setting BAM Metadata: /bin/sh: samtools: not found

  It means that samtools is not in the PATH. I tried to set the PATH
  in a couple of ways according to the Galaxy documentation:
  1. put the path in the env.sh in the tool directory and symlink
     default to the tool directory, e.g.
     default -> /mnt/galaxyTools/tools/samtools/0.1.18
  2. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in ~/.sge_request
  3. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in
     /path/sge_request

  None of them worked, and I got the same problem as above.
 
  Then I checked the job log file in the job_working_directory, and it
 read:
  Samtools Version: 0.1.18 (r982:295)
  SAM file converted to BAM
 
  which shows that SGE knows the PATH of samtools. To double check it, I
  added samtools index to Galaxy, and it worked well. I am very confused why
  SGE knows the tool path but cannot run the job correctly.

  The system I am using is Ubuntu on EC2. I checked out the code from
  galaxy-dist on bitbucket. Other tools such as bwa and bowtie worked well
  using the same setting method (put env.sh in the tools directory to set the
  tool path).
 
  Thank you very much for any help or hints.
 
  Cai
 

[galaxy-dev] library_import_dir -- How is it supposed to work?

2012-04-26 Thread Michael Moore
I set library_import_dir to a path and tried uploading a directory of BAM
files.  After fixing things so Galaxy could find samtools in that
subshell, I was able to upload links to the history.  But moving everything
into one directory did not appear to be terribly useful, so I tested what
happens when subdirectories exist in the library import directory.

Test 1: I used folders u1 and u2, each with data, plus some data in the root
of library_import_dir.  After clearing the samtools error I was presented
with a drop-down list with choices 'None', 'u1' and 'u2'.  Selecting 'None'
did not result in seeking data but in a sharp reminder that I had to pick a
directory.  Selecting those directories led to uploads, with the non-copied
dataset correctly sized and even downloadable from the data library, but NOT
usable in the history, because upload.py decided the file did not exist
(probably the 'path' variable in os.path.exists(), line 99).

It also became apparent that the upload would look one level down from the
root directory and no further (tested by adding u3 with data and a
subdirectory of u3 called v3, also with data).

So the current state of affairs is that there is a single directory to which
one must move files in order to upload links to a Galaxy library or folders
thereof.  Alternatively, make a directory of links (call it directory A),
then another directory of links to the links in directory A at the
library_import_dir, and then ask Galaxy to copy the data.  (Not fully tested
yet.)
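
Before pointing Galaxy at a tree of links, it may help to list what is actually reachable at every depth and which entries `os.path.exists()` would reject, since that call follows symlinks and reports False for a dangling one. The sketch below is a standalone check under those assumptions; `scan_import_dir` is a hypothetical name and does not reflect how Galaxy scans library_import_dir.

```python
import os


def scan_import_dir(root):
    """Yield (path, usable) for every file entry below `root`, at any depth."""
    # followlinks=True descends into symlinked directories; beware of link loops.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=True):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # os.path.exists() follows symlinks, so a dangling link reports False.
            yield path, os.path.exists(path)
```

Any path reported with `False` is a candidate for the "file did not exist" behaviour described above.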

But it is apparent from the UI that more utility was intended.  If I have
time, I will help with that.

Michael Moore

[galaxy-dev] How does one survive the updates?

2012-04-25 Thread Michael Moore
I was running a single instance of Galaxy on my own machine, playing with
library_import and figuring out why BAM files were getting errors on upload.
(It is a PATH matter when one drops into the subshell, and the workaround
is

ln -s /usr/bin/samtools /home/galaxy/bin/samtools

)

Anyway, I seemed to have it figured out last evening, so I shut down the
notebook where Galaxy was running (on RHEL6) and went home.  This morning,
when I started Galaxy with sh run.sh, I got some messages about eggs.ini and
replacing universe_wsgi. files from universe_wsgi.sample. files, and nothing
worked.  My registration was gone.  My admin_user was overwritten.  I
re-registered, restored the file settings and restarted Galaxy, but now no
login sticks.  It does not complain about a registered user, but it does not
appear to hold the session: the User tab does not show me as logged in.  Is
there some different way it is handling cookies?

Michael Moore

Re: [galaxy-dev] /bin/sh: samtools: not found--WORKAROUND

2012-04-24 Thread Michael Moore
There is apparently a persistent problem with samtools which normally lives
at /usr/bin/samtools.  I encountered a similar problem in Python when
uploading BAM files.

I did not resolve the problem.  I hacked for a while on binary.py in a lib/
subdirectory and used os.system to send myself mail describing the
effective path at various points, and I added a missing

logging.basicConfig()

statement and scattered some log.WARNING statements strategically.  All
this told me nothing.  So I made a few symlinks to samtools.  The one that
got things working was

ln -s /usr/bin/samtools /home/galaxy/bin/samtools

so--worked around but not resolved.

Michael

On Tue, Apr 17, 2012 at 12:15 PM, zhengqiu cai caizhq2...@yahoo.com.cn wrote:

 Hi All,

 I submitted a job to convert sam to bam, and the job was running forever
 without outputting the result. I then checked the log, and it read:
 Traceback (most recent call last):
   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 336, in finish_job
     drm_job_state.job_wrapper.finish( stdout, stderr )
   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/jobs/__init__.py", line 637, in finish
     dataset.set_meta( overwrite = False )
   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/model/__init__.py", line 875, in set_meta
     return self.datatype.set_meta( self, **kwd )
   File "/mnt/galaxyTools/galaxy-dist/lib/galaxy/datatypes/binary.py", line 179, in set_meta
     raise Exception, "Error Setting BAM Metadata: %s" % stderr
 Exception: Error Setting BAM Metadata: /bin/sh: samtools: not found

 It means that samtools is not in the PATH. I tried to set the PATH in
 a couple of ways according to the Galaxy documentation:
 1. put the path in the env.sh in the tool directory and symlink default
    to the tool directory, e.g.
    default -> /mnt/galaxyTools/tools/samtools/0.1.18
 2. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in ~/.sge_request
 3. put -v PATH=/mnt/galaxyTools/tools/samtools/0.1.18 in /path/sge_request

 None of them worked, and I got the same problem as above.

 Then I checked the job log file in the job_working_directory, and it read:
 Samtools Version: 0.1.18 (r982:295)
 SAM file converted to BAM

 which shows that SGE knows the PATH of samtools. To double check it, I
 added samtools index to Galaxy, and it worked well. I am very confused why
 SGE knows the tool path but cannot run the job correctly.

 The system I am using is Ubuntu on EC2. I checked out the code from
 galaxy-dist on bitbucket. Other tools such as bwa and bowtie worked well
 using the same setting method (put env.sh in the tools directory to set the
 tool path).

 Thank you very much for any help or hints.

 Cai


Re: [galaxy-dev] Problem with cleaning up galaxy datasets

2012-04-13 Thread Michael Moore
Yes, you do not have a URL like http://some_path or ssh://some_path
defined in your config file.  Look at the command-line options for
your scripts and figure out how they find your config file.  The script
could be looking in the wrong place, or you could have an error in the
config file (a missing piece, or an undefined variable that would default
to False, a boolean).
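
The second traceback below ends in `url.split` on a bool, and the call that leads there passes `config.database_connection`, which suggests that setting was never read as a string. A quick standalone check is to read the ini file directly and see whether the option is present. This is only a sketch: the section name `app:main` is an assumption about the config layout, and `get_database_connection` is a hypothetical helper.

```python
import configparser


def get_database_connection(ini_path, section="app:main"):
    """Return the database_connection string from `ini_path`, or None if unset."""
    # RawConfigParser avoids '%' interpolation; strict=False tolerates duplicates.
    parser = configparser.RawConfigParser(strict=False)
    if not parser.read(ini_path):
        raise IOError("could not read config file: %s" % ini_path)
    if not parser.has_option(section, "database_connection"):
        return None  # missing or commented out
    return parser.get(section, "database_connection")
```

A `None` result here would be consistent with the script falling back to a default of False.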

On 4/11/12, Klaus Metzeler m...@klaus-metzeler.de wrote:

 Dear all,

 I have a problem with the cleanup scripts on my local Galaxy instance.
 I am using the updated cleanup_datasets.py, as per Nate's earlier reply
 here:
 http://gmod.827538.n3.nabble.com/Problem-running-purge-datasets-sh-cleanup-scripts-td3688016.html#none

 This is the output I get when running cleanup_datasets.py:

 ~/ngs-bin/galaxy-dist $ scripts/cleanup_datasets/cleanup_datasets.py -d 2 -6 -r
 Traceback (most recent call last):
   File "scripts/cleanup_datasets/cleanup_datasets.py", line 524, in <module>
     if __name__ == "__main__": main()
   File "scripts/cleanup_datasets/cleanup_datasets.py", line 82, in main
     ini_file = args[0]
 IndexError: list index out of range

 ... and this if I call it via the shell script cleanup_datasets.sh

 ~/ngs-bin/galaxy-dist $ scripts/cleanup_datasets/delete_datasets.sh
 Traceback (most recent call last):
   File "./scripts/cleanup_datasets/cleanup_datasets.py", line 524, in <module>
     if __name__ == "__main__": main()
   File "./scripts/cleanup_datasets/cleanup_datasets.py", line 101, in main
     app = CleanupDatasetsApplication( config )
   File "./scripts/cleanup_datasets/cleanup_datasets.py", line 512, in __init__
     self.model = galaxy.model.mapping.init( config.file_path, config.database_connection, engine_options={}, create_tables=False, object_store=self.object_store )
   File "/home/klausmetzeler/ngs-bin/galaxy-dist/lib/galaxy/model/mapping.py", line 1818, in init
     load_egg_for_url( url )
   File "/home/klausmetzeler/ngs-bin/galaxy-dist/lib/galaxy/model/mapping.py", line 1798, in load_egg_for_url
     dialect = guess_dialect_for_url( url )
   File "/home/klausmetzeler/ngs-bin/galaxy-dist/lib/galaxy/model/mapping.py", line 1794, in guess_dialect_for_url
     return (url.split(':', 1))[0]
 AttributeError: 'bool' object has no attribute 'split'


 Any idea what might be wrong?
 Thanks a lot for your support,
 Klaus





[galaxy-dev] Toolshed tribulations--certain types seem unsupported

2012-04-09 Thread Michael Moore
I am testing Galaxy for wide use, and I have legacy text files that call
algorithms, sorts, and displays.  One such has a tool.xml file with 10
parameters: one select, three integer, and one float, with the rest text
for the moment.  (Some will be type="data" later if the runs equate to the
runs we do outside Galaxy.)

The tool does not show up.  Firefox, emacs and vim all agree that the file
is well-formed, and Galaxy has been properly bounced.  I experimented with
removing parameters and found that with 7 parameters I did not have the
problem; then I noticed that all of them had been changed to text or select
in my desperation to make the tool show up for placement on the workflow.  I
returned to 10 parameters, but this time all type="select" and type="text",
and everything worked.  But if I slipped in even one integer, even with the
(optional) min, max and default tags, the tool would disappear on restart.

Am I looking at a bug, or is there something I need to be doing to make
this tool visible with numeric parameters?
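
One thing that may be worth ruling out (a guess, not a confirmed cause) is a numeric parameter that has no usable `value` attribute, for example one that carries only a "default". The sketch below lists integer and float `param` elements whose `value` is missing or does not parse as a number. `numeric_params_without_value` is a hypothetical helper, and whether Galaxy actually rejects such a tool at load time is an assumption to verify against the server log.

```python
import xml.etree.ElementTree as ET


def numeric_params_without_value(tool_xml_path):
    """Return names of integer/float <param> elements lacking a numeric `value`."""
    converters = {"integer": int, "float": float}
    problems = []
    for param in ET.parse(tool_xml_path).iter("param"):
        convert = converters.get(param.get("type"))
        if convert is None:
            continue  # text, select, data, ...: not checked here
        try:
            convert(param.get("value"))
        except (TypeError, ValueError):
            # TypeError: attribute missing; ValueError: not a number.
            problems.append(param.get("name"))
    return problems
```

An empty list would rule this explanation out; a non-empty one names the parameters to look at first.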

MGM
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/