[ https://issues.apache.org/jira/browse/RANGER-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zbigniew Baranowski updated RANGER-5125:
----------------------------------------
    Attachment: RANGER-5125.patch

> Missing the result column value in ORC File Logging
> ---------------------------------------------------
>
>                 Key: RANGER-5125
>                 URL: https://issues.apache.org/jira/browse/RANGER-5125
>             Project: Ranger
>          Issue Type: Bug
>          Components: audit
>    Affects Versions: 2.3.0, 2.4.0, 2.5.0
>            Reporter: Zbigniew Baranowski
>            Priority: Major
>              Labels: easyfix
>         Attachments: RANGER-5125.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> h4. {*}Description{*}:
> There is an issue in {{ORCFileUtil.log()}} when writing audit logs in ORC
> format. The _result_ field in the audit schema is of type {{short}} and is
> not handled when it is converted to a string. This results in empty
> values in the corresponding _accessResult_ column in the ORC file.
> h4. {*}Affected Component{*}:
>  * {{org.apache.ranger.audit.utils.ORCFileUtil}}
>  * {{castStringObject(Object object)}} method
> h4. {*}Steps to Reproduce{*}:
>  # Run the {{main()}} method of the {{ORCFileUtil}} class:
> [https://github.com/apache/ranger/blob/a90a77e1ce12a0f7193533e846c504caea293d21/agents-audit/src/main/java/org/apache/ranger/audit/utils/ORCFileUtil.java#L85]
>  # This writes an ORC file to /tmp/test.orc.
>  # Open the file with, for example, Spark and read its contents. The
> 'accessResult' column has no value in any row, even when the
> corresponding event had it set.
> {code:java}
> val df = spark.read.orc("/tmp/test.orc")
> df: org.apache.spark.sql.DataFrame = [repositoryType: int, repositoryName: string ... 24 more fields]
> scala> df.show(false)
> 25/01/29 19:28:12 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
> +--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
> |repositoryType|repositoryName|user|eventTime          |accessType|resourcePath            |resourceType|action|accessResult|agentId|policyId|resultReason|aclEnforcer|sessionId|clientType|clientIP |requestData|agentHostname|logType|eventId|seqNum|eventCount|eventDurationMS|additionalInfo|clusterName|zoneName|
> +--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log001  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |0      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log111  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |1      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log221  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |2      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log331  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |3      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log441  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |4      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log551  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |5      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log661  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |6      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log771  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |7      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log881  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |8      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log991  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |9      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log10101|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |10     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log11111|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |11     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log12121|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |12     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log13131|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |13     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log14141|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |14     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log15151|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |15     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log16161|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |16     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log17171|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |17     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log18181|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |18     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log19191|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |19     |0     |1         |0              |              |           |        |
> +--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
> {code}
> *Expected Behavior:*
>  * {{short}} values (the _result_ field) should be converted to strings
> before they are written to ORC.
> h4. {*}Root Cause{*}:
>  * The {{castStringObject(Object object)}} method is missing a case for
> {{Short}}.
>  * Without that case no branch matches a {{short}} value, so the method
> returns {{null}} and nothing is written to the column.
> h4. {*}Proposed Fix{*}:
> Modify {{castStringObject(Object object)}} in {{ORCFileUtil.java}} to
> handle {{Short}} values:
> {code:java}
> protected String castStringObject(Object object) {
>     String ret = null;
>     try {
>         if (object instanceof String) {
>             ret = (String) object;
>         } else if (object instanceof Date) {
>             ret = getDateString((Date) object);
>         } else if (object instanceof Short) { // Fix: added case for Short
>             ret = ((Short) object).toString();
>         }
>     } catch (Exception e) {
>         logger.error("Error while writing into ORC File:", e);
>     }
>     return ret;
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
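
The root cause and proposed fix described in this issue can be checked in isolation with a minimal sketch like the one below. It is not the Ranger code: the class name CastStringObjectSketch, the two static method names, and the date format used by the stand-in getDateString() are all assumptions made for illustration. Only the branch structure follows the method quoted in the issue.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class CastStringObjectSketch {

    // Stand-in for ORCFileUtil.getDateString(); the format is an assumption.
    static String getDateString(Date date) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(date);
    }

    // Branch structure as described under "Root Cause": no case for Short.
    static String castWithoutShort(Object object) {
        String ret = null;
        if (object instanceof String) {
            ret = (String) object;
        } else if (object instanceof Date) {
            ret = getDateString((Date) object);
        }
        return ret;
    }

    // Branch structure of the proposed fix: Short is converted explicitly.
    static String castWithShort(Object object) {
        String ret = null;
        if (object instanceof String) {
            ret = (String) object;
        } else if (object instanceof Date) {
            ret = getDateString((Date) object);
        } else if (object instanceof Short) {
            ret = ((Short) object).toString();
        }
        return ret;
    }

    public static void main(String[] args) {
        // A boxed short, standing in for the audit event's result field.
        Object result = Short.valueOf((short) 1);
        System.out.println("without Short case: " + castWithoutShort(result)); // null
        System.out.println("with Short case:    " + castWithShort(result));    // 1
    }
}
```

Running main() prints null for the variant without the Short branch and 1 for the variant with it, which matches the empty accessResult column reported above.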