We are running Oozie 4.0.0 (via CDH 5.3.2 with YARN), and we are seeing something strange: when we run workflows, the default character set appears to change, and we are not sure why.

When we run a simple Java app containing the line below:

    System.out.println(Charset.defaultCharset());

we get the following from our test code:

    2015-03-05 19:01:05,623 INFO [main] com.test.encoding.Test: US-ASCII

Running a shell script that contains only "locale" also returns POSIX:

    Oozie Launcher, capturing output data:
    =======================
    LANG=
    LC_CTYPE="POSIX"
    LC_NUMERIC="POSIX"
    LC_TIME="POSIX"
    LC_COLLATE="POSIX"
    LC_MONETARY="POSIX"
    LC_MESSAGES="POSIX"
    LC_PAPER="POSIX"
    LC_NAME="POSIX"
    LC_ADDRESS="POSIX"
    LC_TELEPHONE="POSIX"
    LC_MEASUREMENT="POSIX"
    LC_IDENTIFICATION="POSIX"
    LC_ALL=

even though running locale in a bash shell on the nodes shows UTF-8:

    LANG=en_US.UTF-8
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=

Yet when we look at the various settings on the box (JVM, locale, etc.), they all point to UTF-8. In oozie-env.sh we set:

    setting LC_ALL=en_US.UTF-8
    setting LANG=en_US.UTF-8
    setting LANGUAGE=en_US.UTF-8

just to make sure things get set up, but with no success.

Basically, we can't figure out how to make Oozie use UTF-8 rather than ASCII/POSIX. We are backed by a MySQL DB whose default character set is UTF-8 as well.

Any thoughts, suggestions, or places to read/look?

Thanks in advance!

Cheers,
Aaron
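
For anyone trying to reproduce this, a slightly fuller version of the one-line test might look like the sketch below. It is not the original test code: the class name EncodingCheck and the report() helper are hypothetical. It prints the default charset next to the system properties and environment variables that a Linux JVM of this era typically derives it from, so the output of an Oozie Java action can be compared with a run from a login shell on the same node.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic class, not part of Oozie or the original test.
public class EncodingCheck {

    // Collects the default charset plus the inputs that usually determine it.
    static Map<String, String> report() {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("defaultCharset", Charset.defaultCharset().name());
        r.put("file.encoding", System.getProperty("file.encoding"));
        // sun.jnu.encoding is JVM-specific and may be null on some JVMs.
        r.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        for (String name : new String[] {"LANG", "LC_ALL", "LC_CTYPE"}) {
            // Unset variables are recorded as the string "null".
            r.put(name, String.valueOf(System.getenv(name)));
        }
        return r;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : report().entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

If LANG and LC_ALL come back unset inside the action but not in the login shell, that would suggest the environment of the launched container, rather than the JVM installation itself, is where the two runs differ.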
