[ https://issues.apache.org/jira/browse/HIVE-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422787#comment-13422787 ]
Zhenxiao Luo commented on HIVE-3301:
------------------------------------

The problem is:

In hadoop23, TaskLogServlet.java uses a new utility, HtmlQuoting.java, to print the task log.

In TaskLogServlet.java, printTaskLog() function:

    result = taskLogReader.read(b);
    if (result > 0) {
      if (plainText) {
        out.write(b, 0, result);
      } else {
        HtmlQuoting.quoteHtmlChars(out, b, 0, result);
      }
    } else {
      break;
    }

In hadoop20, by contrast, TaskLogServlet.java uses its own utility to print the task log (HtmlQuoting.java does not exist there at all):

In TaskLogServlet.java, printTaskLog() function:

    result = taskLogReader.read(b);
    if (result > 0) {
      if (plainText) {
        out.write(b, 0, result);
      } else {
        quotedWrite(out, b, 0, result);
      }
    } else {
      break;
    }

And in Hive, TaskLogProcessor.java generates the stack trace by reading the raw taskAttemptLog.

In ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java, getStackTraces() function:

    List<String> stackTrace = null;

    // Patterns that match the middle/end of stack traces
    Pattern stackTracePattern = Pattern.compile("^\tat .*", Pattern.CASE_INSENSITIVE);
    Pattern endStackTracePattern =
        Pattern.compile("^\t... [0-9]+ more.*", Pattern.CASE_INSENSITIVE);

    while ((inputLine = in.readLine()) != null) {

      if (stackTracePattern.matcher(inputLine).matches() ||
          endStackTracePattern.matcher(inputLine).matches()) {

For Hive to work with both hadoop20 and hadoop23, the Hive TaskLogProcessor should use a different mechanism for each version when parsing the TaskAttemptLog.

My plan is to create a shim that has different implementations for hadoop20 and hadoop23. In the hadoop23 implementation, HtmlQuoting.unquoteHtmlChars() is used to parse the TaskAttemptLog.
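A rough sketch of what such a shim could look like is given here, only to illustrate the idea; it is not the actual patch. The interface TaskLogLineShim, its two implementations, and the class name are hypothetical names made up for this example, and the local unquote() helper is a simplified stand-in for Hadoop's HtmlQuoting.unquoteHtmlChars() so that the example compiles without Hadoop on the classpath. It assumes the five entities &amp;amp;, &amp;lt;, &amp;gt;, &amp;quot; and &amp;apos; are the ones that need to be reversed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class TaskLogUnquoteSketch {

    /** Hypothetical shim: one implementation per Hadoop version. */
    public interface TaskLogLineShim {
        String normalize(String rawLine);
    }

    /** hadoop20-style behaviour (assumed): the log line is used unchanged. */
    public static final class RawLineShim implements TaskLogLineShim {
        @Override
        public String normalize(String rawLine) {
            return rawLine;
        }
    }

    /** hadoop23-style behaviour (assumed): undo the HTML quoting first. */
    public static final class UnquotingLineShim implements TaskLogLineShim {
        @Override
        public String normalize(String rawLine) {
            return unquote(rawLine);
        }
    }

    /**
     * Simplified stand-in for HtmlQuoting.unquoteHtmlChars().
     * "&amp;" is replaced last so that "&amp;quot;" decodes to the
     * literal text "&quot;" rather than to a quote character.
     */
    public static String unquote(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&");
    }

    /** Collects stack-trace lines after passing each one through the shim. */
    public static List<String> stackTraceLines(List<String> lines, TaskLogLineShim shim) {
        // Same "middle of stack trace" pattern as quoted from TaskLogProcessor.
        Pattern stackTracePattern = Pattern.compile("^\tat .*", Pattern.CASE_INSENSITIVE);
        List<String> result = new ArrayList<String>();
        for (String line : lines) {
            String normalized = shim.normalize(line);
            if (stackTracePattern.matcher(normalized).matches()) {
                result.add(normalized);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
            "java.lang.RuntimeException: bad value &quot;x&quot;",
            "\tat Foo.&lt;init&gt;(Foo.java:1)");
        for (String line : stackTraceLines(log, new UnquotingLineShim())) {
            System.out.println(line);
        }
    }
}
```

With a shim like this, the parsing loop stays the same for both Hadoop versions and only the per-line normalization differs.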
> Fix quote printing bug in mapreduce_stack_trace.q testcase failure when running hive on hadoop23
> ------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-3301
>                 URL: https://issues.apache.org/jira/browse/HIVE-3301
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zhenxiao Luo
>            Assignee: Zhenxiao Luo
>
> When running hive on hadoop0.23, mapreduce_stack_trace.q is failing due to
> quote printing bug:
> quote is printed as: '"', instead of "
> Seems not able to state the bug clearly in html:
> quote is printed as 'address sign' + 'quot' + semicolon
> not the expected 'quote sign'