Ramavtar,

On 2/20/25 6:46 AM, Ramavtar Pareek wrote:
I am facing an issue where some specific keys in my API response contain
non-printable characters instead of the expected Hindi characters. The issue
occurs in our production environment, which has the following architecture:
System Flow:

    1. A Varnish server receives the request.
    2. Varnish forwards the request to our Ensemble API (hosted on Tomcat).
    3. Ensemble API calls Core API, which returns a response.
    4. Ensemble API processes the response and sends it back to Tomcat.
    5. Tomcat returns the final response to Varnish, which then sends it to the
       client.

Issue Observed:

    - Some keys in the Ensemble API’s response contain non-printable characters
      instead of Hindi text.
    - This is not happening for all Hindi characters, only for some specific
      keys.
    - The issue is not reproducible in local environments, only occurring in
      production servers.

What Has Been Configured:

Tomcat Response Settings in Ensemble API:
res.setContentType("text/html");

res.setCharacterEncoding("UTF-8");


Tomcat setenv.sh Configuration:

export CATALINA_OPTS="$CATALINA_OPTS -Dfile.encoding=UTF-8"

Note that file.encoding doesn't change how Tomcat encodes any responses. It may change how the JVM reads files where no specific character encoding has been specified by the code reading the file.
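To illustrate that point, here is a minimal sketch (plain JDK, no Tomcat involved; the class and method names are made up for this example). Bytes produced with an explicitly named charset are the same whatever file.encoding is set to, and a charset that cannot represent Devanagari silently substitutes '?' for each character:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {
    // Encode with an explicitly named charset; file.encoding plays no part here.
    public static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes as lowercase hex so they can be compared by eye.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sample = "\u0939\u093f\u0902\u0926\u0940"; // the word "Hindi" in Devanagari
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("explicit UTF-8 : " + hex(encodeUtf8(sample)));
        // ISO-8859-1 cannot represent these characters, so each one becomes 0x3f ('?').
        System.out.println("ISO-8859-1     : "
                + hex(sample.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

Running it with and without -Dfile.encoding=UTF-8 should print the same two hex lines; only the "default charset" line may differ.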

Core API is correctly returning Hindi characters when tested independently.

What do you mean by "Core API"? Is this your own Java-based code responding to API requests from Tomcat, or was this confirmed using debugging/logging within your own application?

No issues found in JSON serialization (Jackson) when logging the response
in Ensemble API before sending it.

Possible Causes & Questions:

    - Could Tomcat's encoding settings still be affecting this, even though
      file.encoding=UTF-8 is set?

Unlikely.

    The only thing missing from the production server's Tomcat is
    URIEncoding="UTF-8" in the server.xml file. But it only applies to
    correctly encoding/decoding input params. Can it affect the response too?

This will only affect the character encoding when reading request parameters from a URL. Note that some web browsers do not provide a Content-Type when sending HTTP POST parameters, and the URIEncoding can be used as a default for this.
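As a rough illustration of what "decoding request parameters" means here, the sketch below uses the JDK's URLDecoder as a stand-in. This is not Tomcat's own code path, and the class name is hypothetical. The same percent-encoded bytes decode to five Devanagari characters as UTF-8 but to fifteen unrelated characters as ISO-8859-1. Nothing in it touches the response:

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UriDecodingDemo {
    // Decode a percent-encoded value with the given charset (Java 10+ overload).
    public static String decodeAs(String raw, Charset charset) {
        return URLDecoder.decode(raw, charset);
    }

    public static void main(String[] args) {
        // UTF-8 bytes of the word "Hindi" in Devanagari, percent-encoded as in a query string.
        String raw = "%E0%A4%B9%E0%A4%BF%E0%A4%82%E0%A4%A6%E0%A5%80";

        String asUtf8 = decodeAs(raw, StandardCharsets.UTF_8);
        String asLatin1 = decodeAs(raw, StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8      : " + asUtf8.length() + " chars");
        System.out.println("decoded as ISO-8859-1 : " + asLatin1.length() + " chars");
    }
}
```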

    Could there be an issue with how Jackson serializes the JSON, causing
    certain Hindi characters to break?

Unlikely, but possible.

    Could the underlying OS locale affect this behavior? (Checked with locale
    but didn’t find any obvious issues. The locale of both stage and production
    servers is LANG=en_US.UTF-8.)

Unlikely.

    Are there specific headers we need to check to ensure UTF-8 is
    maintained throughout the request/response cycle?

You should first verify that your services through Tomcat are working (or not) as you expect. Make your API calls directly to Tomcat without Varnish in the mix and report back.
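One way to run that check is sketched below, assuming Java 11+ for java.net.http. The default URL is a placeholder, not your real endpoint; pass your own Tomcat host, port, and path as the first argument. It fetches the raw bytes straight from Tomcat, prints the Content-Type header, and reports whether the body is well-formed UTF-8:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class DirectTomcatCheck {
    // True if the bytes decode as UTF-8 with no malformed sequences.
    public static boolean isValidUtf8(byte[] body) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(body));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder URL (an assumption): point this at Tomcat's own connector, bypassing Varnish.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/your-api-path";

        HttpResponse<byte[]> res = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(URI.create(url)).GET().build(),
                HttpResponse.BodyHandlers.ofByteArray());

        System.out.println("Content-Type: "
                + res.headers().firstValue("Content-Type").orElse("(none)"));
        System.out.println("Body is valid UTF-8: " + isValidUtf8(res.body()));
    }
}
```

Running the same check against the Varnish URL and comparing the two outputs should show which hop changes the bytes. Note that a body can be valid UTF-8 and still contain '?' or U+FFFD if the text was damaged before it was encoded, so compare the affected keys as well.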

I would really appreciate any insights into what might be causing this
issue. Please let me know if there are any specific logs, tests, or
additional configurations I should check.

What version of Tomcat are you using?

It looks like you may already have read this, but I'll post it here anyway just in case you haven't seen it:
https://cwiki.apache.org/confluence/display/TOMCAT/Character+Encoding

-chris

