[
https://issues.apache.org/jira/browse/SAMZA-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944823#comment-14944823
]
Tobias Roth commented on SAMZA-700:
-----------------------------------
As mentioned above, the problem is due to Util.envVarEscape(), which simply
escapes single and double quotes globally. The result is then passed to the
shell using weak quoting, i.e. inside double quotes (see for example
https://www.gnu.org/software/bash/manual/html_node/Double-Quotes.html).
To fix this issue, the value of SAMZA_CONFIG should be escaped with:
{code:java}
value.replace("\\", "\\\\").replace("`", "\\`").replace("$",
"\\$").replace("\"", "\\\"").replace("!", "\"'!'\"")
{code}
I'm not sure what the best approach is here: change the behavior of
Util.envVarEscape(), or define a new function
Util.envVarWeakQuotingBashEscape() and add a configuration parameter for
selecting the strategy that should be used?
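For illustration, here is a minimal, self-contained sketch of the replacement chain above, applied to the JSON from the issue description. The class and method names (WeakQuoteEscape, escapeForDoubleQuotes) are hypothetical and not part of Samza; the sketch assumes the escaped value ends up between double quotes on a bash command line.

```java
final class WeakQuoteEscape {
    private WeakQuoteEscape() {}

    // Hypothetical helper (not an existing Samza method): escapes a value so
    // that it can be placed between double quotes on a bash command line.
    static String escapeForDoubleQuotes(String value) {
        return value
            .replace("\\", "\\\\")     // backslashes first, so later escapes are not doubled
            .replace("`", "\\`")       // backtick would start a command substitution
            .replace("$", "\\$")       // dollar sign would start an expansion
            .replace("\"", "\\\"")     // double quote would end the quoted string
            .replace("!", "\"'!'\"");  // close the double quotes, single-quote the '!', reopen
    }

    public static void main(String[] args) {
        String json = "{\"expression\":\"type == 'LINEAR'\"}";
        // Single quotes are left alone; only the double quotes are escaped.
        System.out.println(escapeForDoubleQuotes(json));
        // prints {\"expression\":\"type == 'LINEAR'\"}
    }
}
```

String.replace(CharSequence, CharSequence) performs literal replacement, so "$" and "\\" are not treated as regex metacharacters here. The backslash replacement has to come first, otherwise the backslashes introduced by the later steps would be doubled as well.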
> YarnJob mangles config properties containing quotes
> ---------------------------------------------------
>
> Key: SAMZA-700
> URL: https://issues.apache.org/jira/browse/SAMZA-700
> Project: Samza
> Issue Type: Bug
> Affects Versions: 0.8.0, 0.9.0
> Reporter: Tommy Becker
>
> YarnJob passes the Config to the AM via an environment variable,
> SAMZA_CONFIG. After serializing the Config to JSON, it goes through
> Util.envVarEscape(), which I think is behaving improperly. Specifically,
> that method escapes single quotes globally, even inside double quotes.
> Consider the following config property:
> {code:javascript}
> expression="type == 'LINEAR'"
> {code}
> After encoding to JSON this looks like this:
> {code:javascript}
> {"expression":"type == 'LINEAR'"}
> {code}
> And after being run through Util.envVarEscape():
> {code:javascript}
> {\"expression\":\"type == \'LINEAR\'\"}
> {code}
> I presume these values are being escaped because the YARN client is passing
> them through the shell at some point. But the escaping is too simplistic;
> single quotes should not be escaped within double quotes. As a result, the
> value arrives at the AppMaster as follows:
> {code:javascript}
> {"expression": "type == \'LINEAR\'"}
> {code}
> At which point Jackson chokes on it because \' is invalid JSON (invalid
> escape sequence):
> {noformat}
> Exception in thread "main" org.codehaus.jackson.JsonParseException:
> Unrecognized character escape ''' (code 39)
> at [Source: java.io.StringReader@1b6e1eff; line: 1, column: 2814]
> at
> org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1433)
> at
> org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:521)
> at
> org.codehaus.jackson.impl.JsonParserMinimalBase._handleUnrecognizedCharacterEscape(JsonParserMinimalBase.java:496)
> at
> org.codehaus.jackson.impl.ReaderBasedParser._decodeEscaped(ReaderBasedParser.java:1606)
> at
> org.codehaus.jackson.impl.ReaderBasedParser._finishString2(ReaderBasedParser.java:1353)
> at
> org.codehaus.jackson.impl.ReaderBasedParser._finishString(ReaderBasedParser.java:1330)
> at
> org.codehaus.jackson.impl.ReaderBasedParser.getText(ReaderBasedParser.java:200)
> at
> org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.deserialize(UntypedObjectDeserializer.java:59)
> at
> org.codehaus.jackson.map.deser.std.MapDeserializer._readAndBind(MapDeserializer.java:319)
> at
> org.codehaus.jackson.map.deser.std.MapDeserializer.deserialize(MapDeserializer.java:249)
> at
> org.codehaus.jackson.map.deser.std.MapDeserializer.deserialize(MapDeserializer.java:33)
> at
> org.codehaus.jackson.map.ObjectMapper._readMapAndClose(ObjectMapper.java:2732)
> at
> org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1863)
> at
> org.apache.samza.config.serializers.JsonConfigSerializer$.fromJson(JsonConfigSerializer.scala:34)
> at
> org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:72)
> at org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
> {noformat}
> This is particularly nasty since I don't see a way for any quotes, single or
> double, to get passed to the job successfully and remain intact. I know the
> way this config is passed has undergone some change, but I don't know the
> details, so I wanted to get this issue on record.
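The mangling described in the report can be reproduced with a small simulation. This is only a sketch under stated assumptions: naiveEscape is a reconstruction of the behavior the report attributes to Util.envVarEscape() (not its actual source), and unquoteDouble is a simplified model of bash double-quote processing in which a backslash is removed only before $, `, " or \ (backslash-newline and expansions are ignored). Both names are hypothetical.

```java
final class NaiveEscapeDemo {
    private NaiveEscapeDemo() {}

    // Assumed reconstruction of the reported behavior: escape every single
    // and double quote, regardless of context.
    static String naiveEscape(String value) {
        return value.replace("\"", "\\\"").replace("'", "\\'");
    }

    // Simplified model of how bash reads the body of a double-quoted string:
    // a backslash is dropped only when it precedes $, `, " or \.
    static String unquoteDouble(String body) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '\\' && i + 1 < body.length()
                    && "$`\"\\".indexOf(body.charAt(i + 1)) >= 0) {
                out.append(body.charAt(++i)); // keep the escaped character only
            } else {
                out.append(c);                // everything else is literal, including \'
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String json = "{\"expression\":\"type == 'LINEAR'\"}";
        // The backslash before each single quote survives the shell.
        System.out.println(unquoteDouble(naiveEscape(json)));
        // prints {"expression":"type == \'LINEAR\'"}
    }
}
```

Under this model the double quotes round-trip correctly, but each \' stays in the value as two characters, which is the invalid JSON escape that Jackson rejects in the stack trace above.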