I tried different combinations ...

Setting

DEFAULT_ENCODING = 'latin-1'

in lib/galaxy/util/__init__.py is what worked!


Some background information:
Yes, we also use a MySQL database.
All encodings (server, R/Bioconductor, used editors, ... ) are UTF-8.
The changeset (https://bitbucket.org/galaxy/galaxy-central/commits/e786022dc67ed918050bd81b9ac679ac958e4f75)
is included in our instance.

I tried the patch*, which only shortened the error output; the "latin-1" error remained.
I tried adding "?charset=utf8" to the database connection string.
I changed all table encodings (ALTER TABLE tbl_name ...) in MySQL; it had been another UTF-8 "dialect".
I tried combinations of (almost) all of these, without success.

DEFAULT_ENCODING = 'latin-1' worked with the patch* reverted
(I didn't test it with the patch applied).

Hope this helps!

Best,
Christian



*patch:
# HG changeset patch
# User galaxy-...@invbfotv02.agresearch.co.nz
# Date 1380067994 0
#      Wed Sep 25 00:13:14 2013 +0000
# Branch agr003
# Node ID e130088e73282b3de82bd9f601b1e37508ea6a3a
# Parent  6822f41bc9bb2a2bf4673d6dcdeb1939730d970f
Fixed handling of unicode in tool stdout and stderr.
...

On 05.12.2013 23:03, John Chilton wrote:
Hmmm... can you try adding "?charset=utf8" to your database connection
string - that may fix the problem?
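
A minimal sketch (Python 3, not part of the original exchange) of what
adding the charset parameter to a connection URL looks like. The user name,
password and host below are hypothetical; "galaxydb" is the database name
mentioned elsewhere in this thread. Whether the parameter actually cures the
error depends on the MySQL connector, so treat this only as an illustration
of the URL change:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_charset(url, charset="utf8"):
    """Return url with its 'charset' query parameter set to charset."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous charset setting.
    params = [(k, v) for k, v in parse_qsl(parts.query) if k != "charset"]
    params.append(("charset", charset))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical credentials and host, for illustration only.
print(with_charset("mysql://galaxy:secret@localhost/galaxydb"))
```

The helper replaces an existing charset value rather than appending a second
one, so it can be applied to a URL that already carries the parameter.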

If not, is there a way to tell whether the actual columns have changed?
Some comments on Stack Overflow make it sound like the commands you
listed will only affect new columns.

Can you try CONVERT TO CHARACTER SET?

ALTER TABLE tbl_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
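
As a rough sketch (Python 3, an illustration rather than a tested migration
script), the statement above could be generated for a list of tables like
this. The function name and the identifier check are assumptions made for
the example; "job" and "task" are the tables mentioned elsewhere in this
thread. Back up the database before running any generated statement:

```python
import re

# Only plain identifiers are accepted, so a table name cannot smuggle in SQL.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def convert_statements(tables, charset="utf8", collation="utf8_general_ci"):
    """Build one ALTER TABLE ... CONVERT TO statement per table name."""
    statements = []
    for name in tables:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError("unexpected table name: %r" % (name,))
        statements.append(
            "ALTER TABLE `%s` CONVERT TO CHARACTER SET %s COLLATE %s;"
            % (name, charset, collation)
        )
    return statements


for statement in convert_statements(["job", "task"]):
    print(statement)
```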

I don't think the problem is SQLAlchemy, right? This works for
Postgres and SQLite, I believe. Either MySQL cannot store
UTF-8 data in that column, or there is a problem in the MySQL
connector. It is not clear to me where the problem is based on your
stack trace and explanation. I would be happy to work around a
limitation in the MySQL connector by adding a config option to Galaxy
if I were certain that there was a bug in the MySQL connector.

-John

On Thu, Dec 5, 2013 at 12:55 PM, David Hoover <hoove...@helix.nih.gov> wrote:
John,

I stopped galaxy, then ran ALTER DATABASE galaxydb DEFAULT CHARACTER SET = 
'utf8', then ran ALTER TABLE `[table]` DEFAULT CHARACTER SET = 'utf8' on all 
the tables in galaxydb.  After starting up galaxy and rerunning the jobs (using 
the unaltered version of lib/galaxy/util/__init__.py), the job failed with the 
same error.

Can I configure the sqlalchemy connection to use utf8?  Or must I reconfigure 
the entire server to use utf8?

--David

On Dec 5, 2013, at 1:32 PM, John Chilton wrote:

Fantastic!

For this particular problem - I guess you don't strictly need to
modify more than just job and maybe task tables. I suspect at some
point there will be a non-latin-1 job parameter or history name or
username, etc... that will result in a similar problem though - so if
you could just make it all UTF-8 that would probably be ideal.

-John


On Thu, Dec 5, 2013 at 12:01 PM, David Hoover <hoove...@helix.nih.gov> wrote:
Right, never mind: 'hg log' listed that changeset as 10953:e786022dc67e.

Changing DEFAULT_ENCODING to 'latin-1' in lib/galaxy/util/__init__.py worked.

Do I need to alter ALL the MySQL tables to UTF-8, or just a selection of 
tables?  Will future updates explicitly create new tables with CHARSET=utf-8, 
or do I need to reconfigure MySQL to have a new default?

-- David

On Dec 5, 2013, at 12:32 PM, John Chilton wrote:

Actually, can you verify that this commit
https://bitbucket.org/galaxy/galaxy-central/commits/e786022dc67ed918050bd81b9ac679ac958e4f75
is in your distribution and if it is try changing:

DEFAULT_ENCODING = 'utf-8'

in lib/galaxy/util/__init__.py to

DEFAULT_ENCODING = 'latin-1'

If that works then - I can create a database_encoding_default option
in universe_wsgi.ini and let you switch it to latin-1 instead of
needing to patch Galaxy. Otherwise, setting the MySQL tables to be
UTF-8 is probably the better approach - though again - backup and test
before applying that change in production.

Hope this helps,
-John


On Thu, Dec 5, 2013 at 11:17 AM, John Chilton <chil...@msi.umn.edu> wrote:
David, Christian,

Very sorry about this - this is probably related to fixing some other
errors - 
http://dev.list.galaxyproject.org/Unicode-in-tool-stderr-crashing-galaxy-tt4661749.html#a4661750.
I will try to look into this.

Christian - what database are you targeting? Is it MySQL as well?

David - do you have a test setup you can hack on? I wonder if this
would go away if you converted your tables to UTF-8.

http://stackoverflow.com/questions/6115612/how-to-convert-an-entire-mysql-database-characterset-and-collation-to-utf-8

That is not my official recommendation though - I need to do some more
research first.

-John

On Thu, Dec 5, 2013 at 11:04 AM, Christian Hundsrucker
<christian.hundsruc...@fmi.ch> wrote:
Hi David, hi all!

I have a similar/the same issue in another setting...

galaxy/galaxy_dist/lib/galaxy/jobs/runners/local.py", line 116, in queue_job
   job_wrapper.finish( stdout, stderr, exit_code )
[...]

galaxy/galaxy_dist/eggs/SQLAlchemy-0.7.9-py2.6-linux-x86_64-ucs4.egg/sqlalchemy/orm/persistence.py",
line 485, in _emit_update_statements
[...]

UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2018' in
position 134: ordinal not in range(256)
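
For illustration (a Python 3 sketch, whereas the traceback comes from
Python 2), the failure can be reproduced without Galaxy or MySQL: the
characters named in the two errors in this thread have code points above
255, so latin-1 cannot represent them, while UTF-8 can. The helper name
can_encode is made up for this example:

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# U+2018 (left single quotation mark) and U+FFFD (replacement character)
# are the two characters reported in the errors in this thread.
for char in ("\u2018", "\ufffd"):
    print(hex(ord(char)), can_encode(char, "latin-1"), can_encode(char, "utf-8"))
```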


I am integrating a set of R/Bioconductor modules into our local Galaxy
instance.
To do so, I use the discard_stderr_wrapper.sh.
It worked fine until the recent update*
As the error appears upon any R-output (via print, cat or error channel), I
just set the option "-v" for the cat command in the
discard_stderr_wrapper.sh:

cat $TMPFILE >&2
=>
cat -v $TMPFILE >&2


as a temporary workaround.
No idea if this is applicable in your case?!
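
The idea behind the workaround, making the output safe before Galaxy stores
it, can be sketched in Python 3 as follows. This is an assumption-laden
analogue, not what cat -v does byte for byte: cat -v uses its own notation
for non-printing bytes, while this sketch uses Python's backslash escapes.
The function name ascii_safe is hypothetical:

```python
def ascii_safe(text):
    """Replace every non-ASCII character with a backslash escape."""
    return text.encode("ascii", "backslashreplace").decode("ascii")


message = "Error in \u2018foo\u2019"
safe = ascii_safe(message)
print(safe)

# The escaped text contains only ASCII, so a latin-1 encode no longer fails.
safe.encode("latin-1")
```

Like cat -v, this loses the original characters, so it is a stopgap rather
than a fix for the encoding mismatch itself.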

Cheers,
Christian

*
changeset:   11219:5c789ab4144a
branch:      stable
tag:         tip




On 05.12.2013 17:29, David Hoover wrote:

I have installed the ngsplot galaxy tool from
http://code.google.com/p/ngsplot.  This tool creates a set of three pdf
files.  In older versions of Galaxy, the tool ran correctly with no
problems.  A recent update broke the tool.  The job runs but is unable to
finish.  Here is the error reported:

Traceback (most recent call last):
  File "/spin1/users/galaxy/galaxy/lib/galaxy/jobs/runners/local.py", line 116, in queue_job
    job_wrapper.finish( stdout, stderr, exit_code )
  File "/spin1/users/galaxy/galaxy/lib/galaxy/jobs/__init__.py", line 1015, in finish
    self.sa_session.flush()
  File "build/bdist.linux-x86_64/egg/sqlalchemy/orm/scoping.py", line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "build/bdist.linux-x86_64/egg/sqlalchemy/orm/session.py", line 1718, in flush
    self._flush(objects)
  File "build/bdist.linux-x86_64/egg/sqlalchemy/orm/session.py", line 1789, in _flush
    flush_context.execute()
  File "build/bdist.linux-x86_64/egg/sqlalchemy/orm/unitofwork.py", line 331, in execute
    rec.execute(self)
  File "build/bdist.linux-x86_64/egg/sqlalchemy/orm/unitofwork.py", line 475, in execute
    uow
  File "build/bdist.linux-x86_64/egg/sqlalchemy/orm/persistence.py", line 59, in save_obj
    mapper, table, update)
  File "build/bdist.linux-x86_64/egg/sqlalchemy/orm/persistence.py", line 485, in _emit_update_statements
    execute(statement, params)
  File "build/bdist.linux-x86_64/egg/sqlalchemy/engine/base.py", line 1449, in execute
    params)
  File "build/bdist.linux-x86_64/egg/sqlalchemy/engine/base.py", line 1584, in _execute_clauseelement
    compiled_sql, distilled_params
  File "build/bdist.linux-x86_64/egg/sqlalchemy/engine/base.py", line 1691, in _execute_context
    context)
  File "build/bdist.linux-x86_64/egg/sqlalchemy/engine/default.py", line 331, in do_execute
    cursor.execute(statement, parameters)
  File "build/bdist.linux-x86_64/egg/MySQLdb/cursors.py", line 158, in execute
    query = query % db.literal(args)
  File "build/bdist.linux-x86_64/egg/MySQLdb/connections.py", line 265, in literal
    return self.escape(o, self.encoders)
  File "build/bdist.linux-x86_64/egg/MySQLdb/connections.py", line 203, in unicode_literal
    return db.literal(u.encode(unicode_literal.charset))
UnicodeEncodeError: 'latin-1' codec can't encode character u'\ufffd' in position 11: ordinal not in range(256)


There is a set of files created in the job_working_directory that start with
'metadata_', some of which contain the unicode.

Is there anything I can do to fix this?

David Hoover
Helix Systems Staff



___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/


