Hello Hans-Rudolf,

On Wed, Apr 25, 2012 at 1:37 AM, Hans-Rudolf Hotz <h...@fmi.ch> wrote:
> Hi Dan
>
>
> There seem to be several issues, connected with each other or not; I
> don't know.
>
>
> - Let's start with the 'curiosity': Do you get this problem with any
>  tool?

No. And, I should be a bit more specific about what happens. When I
click "execute" I see this:

The following job has been successfully added to the queue:

36: RNASeq tool on data 3 and data 4
37: RNASeq tool on data 3 and data 4


>  Does it also happen with a 'simple' (ie not using R) tool you
>  add?
>

No.

> - When you execute your R script on the command line, are you running
>  it as the same user as Galaxy executes the job?

Yes.


>
>
> - we execute R scripts the following way:
>
>  <command>Rscript --vanilla /full path to script/script.R -n $name
>     -i $infile -o $outfile</command>
>
>  and in the R script we use the library 'getopt'
>
>
>

Good idea. Taking a closer look, I found my R script was not
expecting arguments in the same order that the XML script provided
them in.
I fixed that and now the R script runs without warning or error, but I
still get the "8-bit bytestrings" error, and the error "Unable to
finish job".
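
Argument order may not be the only way the wrapper changes what the R
script sees. Rscript_wrapper.sh (quoted below) forwards its arguments
with an unquoted $*, which re-splits any value containing whitespace,
so a quoted "$organism" would arrive as several arguments. Here is a
minimal sketch of the difference. count_args is a hypothetical
stand-in for Rscript, and "Fruit fly" is an assumed value chosen only
because it contains a space:

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for Rscript: it only reports
# how many arguments it received.
count_args() { echo "$#"; }

# Forwarding as the current wrapper does: unquoted $* is word-split again.
forward_star() { count_args $*; }

# Forwarding with "$@": each original argument is passed through intact.
forward_at() { count_args "$@"; }

# "Fruit fly" is an assumed value, chosen only because it contains a space.
star_count=$(forward_star counts.tsv design.tsv "Fruit fly")
at_count=$(forward_at counts.tsv design.tsv "Fruit fly")

echo "unquoted \$*: $star_count arguments"
echo "quoted \"\$@\": $at_count arguments"
```

If this turns out to matter, the wrapper body could be written as
Rscript "$@" instead of Rscript $*.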

However, by printing out the arguments within R, I can see the name of
the output files that are supposed to be generated, and it looks like
they are generated. So why doesn't the job finish?

I think the 8-bit bytestrings error might be because R outputs curly
single quotes (not plain ASCII quotes or backticks), and they show up
escaped in the galaxy error like this: \xe2\x80\x98 and \xe2\x80\x99
(the UTF-8 bytes for U+2018 and U+2019).

Is there some way that I can set my LOCALE so that R uses an encoding
that galaxy is happier with? Or perhaps I can get R to stop outputting
those curly quotes somehow.
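
For reference, the bytes \xe2\x80\x98 and \xe2\x80\x99 are the UTF-8
encodings of the left and right single quotation marks (U+2018 and
U+2019). R appears to print these "fancy" quotes in a UTF-8 locale, so
running Rscript with LC_ALL=C, or calling options(useFancyQuotes =
FALSE) at the top of the script, may be enough; neither has been
verified inside Galaxy here. Failing that, the wrapper could filter
the output. This is a minimal sketch, where to_ascii_quotes is a
hypothetical helper name:

```shell
#!/bin/sh
# UTF-8 byte sequences for U+2018 and U+2019, built with octal escapes
# so the script itself stays pure ASCII.
lq=$(printf '\342\200\230')
rq=$(printf '\342\200\231')

# to_ascii_quotes (hypothetical helper): replace curly single quotes on
# stdin with plain apostrophes. LC_ALL=C makes sed treat input as bytes.
to_ascii_quotes() {
    LC_ALL=C sed -e "s/$lq/'/g" -e "s/$rq/'/g"
}

# Sample line modelled on the stderr text in the traceback.
sample=$(printf 'Attaching package: \342\200\230BiocGenerics\342\200\231')
cleaned=$(printf '%s\n' "$sample" | to_ascii_quotes)
echo "$cleaned"
```

In the wrapper this could be applied as
Rscript "$@" 2>&1 | to_ascii_quotes, at the cost of merging stderr
into stdout and losing Rscript's exit status.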

> I hope this gets you a little bit further.


Thanks much,
Dan


>
>
> Regards, Hans
>
>
>
>
>
> On 04/24/2012 11:56 PM, Dan Tenenbaum wrote:
>>
>> Apologies for originally posting this to galaxy-user; now I realize it
>> belongs here.
>>
>> Hello,
>>
>> I'm a galaxy newbie and running into several issues trying to adapt an
>> R script to be a galaxy tool.
>>
>> I'm looking at the XY plotting tool for guidance
>> (tools/plot/xy_plot.xml), but I decided not to embed my script in XML,
>> but instead have it in a separate script file, that way I can still
>> run it from the command line and make sure it works as I make
>> incremental changes. (So my script starts with
>> args <- commandArgs(TRUE).) Also, if it doesn't work, this suggests to
>> me that there is a problem with my galaxy configuration.
>>
>> First, I tried using the r_wrapper.sh script that comes with the XY
>> plotting tool,  but it threw away my arguments:
>>
>> An error occurred running this job: ARGUMENT
>> '/Users/dtenenba/dev/galaxy-dist/database/files/000/dataset_4.dat'
>> __ignored__
>>
>> ARGUMENT
>> '/Users/dtenenba/dev/galaxy-dist/database/files/000/dataset_3.dat'
>> __ignored__
>>
>> ARGUMENT 'Fly' __ignored__
>>
>> ARGUMENT 'Tagwise' __ignored__
>>
>> etc.
>>
>> So then I tried just switching to Rscript:
>>
>>   <command interpreter="bash">Rscript RNASeq.R $countsTsv $designTsv
>> "$organism" $dispersion $minimumCountsPerMillion
>> $minimumSamplesPerTranscript $out_file1 $out_file2</command>
>>
>> (My script produces as output a csv file and a pdf file. The final two
>> arguments I'm passing are the names of those files.)
>>
>> But then I get an error that Rscript can't be found.
>>
>> So I wrote a little wrapper script, Rscript_wrapper.sh:
>>
>> #!/bin/sh
>>
>> Rscript $*
>>
>> And called that:
>>   <command interpreter="bash">Rscript_wrapper.sh RNASeq.R $countsTsv
>> $designTsv "$organism" $dispersion $minimumCountsPerMillion
>> $minimumSamplesPerTranscript $out_file1 $out_file2</command>
>>
>> Then I got an error that RNASeq.R could not be found.
>>
>> So then I added the absolute path to my R script to the <command> tag.
>> This seemed to work (that is, it got me further, to the next error),
>> but I'm not sure why I had to do this; in all the other tools I'm
>> looking at, the directory to the script to run does not have to be
>> specified; I assumed that the command would run in the appropriate
>> directory.
>>
>> So now I've specified the full path to my R script:
>>
>>   <command interpreter="bash">Rscript_wrapper.sh
>> /Users/dtenenba/dev/galaxy-dist/tools/bioc/RNASeq.R $countsTsv
>> $designTsv "$organism" $dispersion $minimumCountsPerMillion
>> $minimumSamplesPerTranscript $out_file1 $out_file2</command>
>>
>> And I get the following long error, which includes all of the output
>> of my R script:
>>
>> Traceback (most recent call last):
>>   File "/Users/dtenenba/dev/galaxy-dist/lib/galaxy/jobs/runners/local.py",
>> line 133, in run_job
>>     job_wrapper.finish( stdout, stderr )
>>   File "/Users/dtenenba/dev/galaxy-dist/lib/galaxy/jobs/__init__.py",
>> line 725, in finish
>>     self.sa_session.flush()
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/scoping.py",
>> line 127, in do
>>     return getattr(self.registry(), name)(*args, **kwargs)
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/session.py",
>> line 1356, in flush
>>     self._flush(objects)
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/session.py",
>> line 1434, in _flush
>>     flush_context.execute()
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/unitofwork.py",
>> line 261, in execute
>>     UOWExecutor().execute(self, tasks)
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/unitofwork.py",
>> line 753, in execute
>>     self.execute_save_steps(trans, task)
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/unitofwork.py",
>> line 768, in execute_save_steps
>>     self.save_objects(trans, task)
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/unitofwork.py",
>> line 759, in save_objects
>>     task.mapper._save_obj(task.polymorphic_tosave_objects, trans)
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/orm/mapper.py",
>> line 1413, in _save_obj
>>     c = connection.execute(statement.values(value_params), params)
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/engine/base.py",
>> line 824, in execute
>>     return Connection.executors[c](self, object, multiparams, params)
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/engine/base.py",
>> line 874, in _execute_clauseelement
>>     return self.__execute_context(context)
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/engine/base.py",
>> line 896, in __execute_context
>>     self._cursor_execute(context.cursor, context.statement,
>> context.parameters[0], context=context)
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/engine/base.py",
>> line 950, in _cursor_execute
>>     self._handle_dbapi_exception(e, statement, parameters, cursor,
>> context)
>>   File
>> "/Users/dtenenba/dev/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.7.egg/sqlalchemy/engine/base.py",
>> line 931, in _handle_dbapi_exception
>>     raise exc.DBAPIError.instance(statement, parameters, e,
>> connection_invalidated=is_disconnect)
>> ProgrammingError: (ProgrammingError) You must not use 8-bit
>> bytestrings unless you use a text_factory that can interpret 8-bit
>> bytestrings (like text_factory = str). It is highly recommended that
>> you instead just switch your application to Unicode strings. u'UPDATE
>> job SET update_time=?, stdout=?, stderr=? WHERE job.id = ?'
>> ['2012-04-24 18:55:45.791417', '', 'BiocInstaller version 1.5.7,
>> ?biocLite for help\nWarning message:\nNAs introduced by coercion
>> \nLoading required package: methods\nLoading required package:
>> limma\nLoading required package: BiasedUrn\nLoading required package:
>> geneLenDataBase\nLoading required package: org.Dm.eg.db\nLoading
>> required package: AnnotationDbi\nLoading required package:
>> BiocGenerics\n\nAttaching package:
>> \xe2\x80\x98BiocGenerics\xe2\x80\x99\n\nThe following object(s) are
>> masked from \xe2\x80\x98package:stats\xe2\x80\x99:\n\n    xtabs\n\nThe
>> following object(s) are masked from
>> \xe2\x80\x98package:base\xe2\x80\x99:\n\n    anyDuplicated, cbind,
>> colnames, duplicated, eval, Filter, Find,\n    get, intersect, lapply,
>> Map, mapply, mget, order, paste, pmax,\n    pmax.int, pmin, pmin.int,
>> Position, rbind, Reduce, rep.int,\n    rownames, sapply, setdiff,
>> table, tapply, union, unique\n\nLoading required package:
>> Biobase\nWelcome to Bioconductor\n\n    Vignettes contain introductory
>> material; view with\n    \'browseVignettes()\'. To cite Bioconductor,
>> see\n    \'citation("Biobase")\', and for packages
>> \'citation("pkgname")\'.\n\nLoading required package:
>> DBI\n\nCalculating library sizes from column totals.\nError in
>> matrix(u, nrow = nrows, byrow = TRUE) : \n  negative extents to
>> matrix\nCalls: plotMDS.DGEList ... equalizeLibSizes ->  splitIntoGroups
>> ->  lapply ->  FUN ->  matrix\nExecution halted\n', 15]
>>
>> Note that if I run my script from the command line:
>>
>> ./Rscript_wrapper.sh RNASeq.R
>> /Users/dtenenba/dev/galaxy-dist/database/files/000/dataset_4.dat
>> /Users/dtenenba/dev/galaxy-dist/database/files/000/dataset_3.dat Fly 1
>> 1 Tagwise MDSPlot.pdf outputs.csv
>>
>> It works fine and does not produce a warning about "NAs introduced by
>> coercion", nor does it fail with the "Error in matrix" above.
>>
>> So, can anyone tell me what is going wrong here? Why does R behave
>> differently in galaxy than it does on the command line? (I'm using the
>> same instance of R, same machine, for my galaxy and command-line
>> efforts). Is this 8-bit bytestring error a red herring? Can I filter
>> it so that galaxy is happy?
>>
>> Finally, one other curiosity. Every time I hit "Execute" in galaxy to
>> run my tool, it is run twice--two jobs are created (which each fail in
>> the same way). Why is this?
>>
>> My R script:
>> https://gist.github.com/2482783
>>
>> My XML file:
>> https://gist.github.com/2482792
>>
>> I can share more data (such as sample input files) if necessary.
>>
>> Thanks for your help.
>> Dan
>> ___________________________________________________________
>> Please keep all replies on the list by using "reply all"
>> in your mail client.  To manage your subscriptions to this
>> and other Galaxy lists, please use the interface at:
>>
>>   http://lists.bx.psu.edu/

