[
https://issues.apache.org/jira/browse/ATLAS-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201000#comment-17201000
]
Ashutosh Mestry commented on ATLAS-3953:
----------------------------------------
I have attached a couple of things:
* _path.zip_: Exported ZIP file containing an entity with special characters
in its name.
* _a5c...json_: The same entity as represented within Atlas.
I was not able to reproduce the problem.
I also have a patch that specifies _UTF-8_ encoding when writing strings to the
ZIP stream. Would you be able to try out the patch?
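For illustration, here is a minimal sketch of the idea behind the patch. It is not the patch itself and not the actual ZipSink code: the class name _ZipEncodingSketch_ and its helper methods are hypothetical, and _US-ASCII_ is used only as a stand-in for a platform default charset that cannot represent the characters. It shows how an unmappable character turns into "?" and how passing _UTF-8_ explicitly keeps the string intact through a ZIP round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipEncodingSketch {

    // Hypothetical helper (not Atlas code): writes one JSON string as a
    // single ZIP entry, encoding it explicitly as UTF-8.
    public static byte[] zipUtf8(String entryName, String json) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(json.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return buffer.toByteArray();
    }

    // Hypothetical helper: reads the first ZIP entry back and decodes it as UTF-8.
    public static String unzipUtf8(byte[] zipped) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipped))) {
            zip.getNextEntry();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = zip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Simulates s.getBytes() on a JVM whose default charset cannot represent
    // the characters; US-ASCII is an assumed stand-in for such a default.
    public static String lossyRoundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        // "Distribuição", written with Unicode escapes so the source file's
        // own encoding does not matter.
        String name = "Distribui\u00e7\u00e3o";

        // Unmappable characters are replaced with '?'.
        System.out.println(lossyRoundTrip(name));

        // With explicit UTF-8 the string survives the ZIP round trip.
        System.out.println(unzipUtf8(zipUtf8("entity.json", name)).equals(name));
    }
}
```

Whether the exported file is affected in a given deployment would depend on the JVM's default charset, which may explain why the problem does not show up in every environment.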
> JSON Files from Export API with "?" char for string with special chars
> -----------------------------------------------------------------------
>
> Key: ATLAS-3953
> URL: https://issues.apache.org/jira/browse/ATLAS-3953
> Project: Atlas
> Issue Type: Bug
> Components: atlas-core
> Affects Versions: 2.1.0
> Environment: Apache Atlas 2.1.0 embedded HBASE and SOLR
> Reporter: Carlos Alberto Rocha Cardoso
> Assignee: Ashutosh Mestry
> Priority: Major
> Attachments: 9fdc3ad0-46c2-430a-89c4-4a751d31c064.json,
> a5c148bf-5ab6-4c49-853e-855842102128.json, path.zip
>
>
> The Export API returns a ZIP file with JSON files describing Atlas
> Entities and TypeDefs.
> I am having an issue where some special characters in the JSON are replaced by
> "?" characters.
> An Entity name like "Distribuição" was exported in the JSON file as
> "Distribui??o". The special characters "çã" were replaced by "??".
> I tried changing the exported JSON file encoding and the request header for
> the Export API, but without success.
> After analyzing the Atlas source code, especially the *splitAndWriteBytes*
> method of the
> *[ZipSink|https://github.com/apache/atlas/blob/cc601d7371fae1dbc16b55d1ca84f06b745700dc/repository/src/main/java/org/apache/atlas/repository/impexp/ZipSink.java]
> class*, I thought the problem might be that *s.getBytes()* is
> returning the JSON string to be written to the ZIP in an encoding other than
> *UTF-8*, and that setting the encoding with *s.getBytes(StandardCharsets.UTF_8)*
> could be a solution.
> It's my first contact with the Atlas source code, and I'm not a Java
> programmer, so it's only a guess.
> I saw that it's possible to set the default encoding for the platform, or JVM,
> but as noted in the discussion below, this may not work properly in
> all situations.
> [https://stackoverflow.com/questions/361975/setting-the-default-java-character-encoding]