[ 
https://issues.apache.org/jira/browse/RANGER-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zbigniew Baranowski updated RANGER-5125:
----------------------------------------
    Attachment: RANGER-5125.patch

> Missing the result column value in ORC File Logging
> ---------------------------------------------------
>
>                 Key: RANGER-5125
>                 URL: https://issues.apache.org/jira/browse/RANGER-5125
>             Project: Ranger
>          Issue Type: Bug
>          Components: audit
>    Affects Versions: 2.3.0, 2.4.0, 2.5.0
>            Reporter: Zbigniew Baranowski
>            Priority: Major
>              Labels: easyfix
>         Attachments: RANGER-5125.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> h4. {*}Description{*}:
> There is an issue in {{ORCFileUtil.log()}} when writing audit logs in ORC 
> format. The _result_ field in the audit schema is of type {{short}} and is 
> not converted when it is cast to a string. As a result, the corresponding 
> _accessResult_ column in the ORC file is always empty.
> h4. {*}Affected Component{*}:
>  * {{org.apache.ranger.audit.utils.ORCFileUtil}}
>  * {{castStringObject(Object object)}} method
> h4. {*}Steps to Reproduce{*}:
>  # Run {{main()}} from the {{ORCFileUtil}} class:  
> [https://github.com/apache/ranger/blob/a90a77e1ce12a0f7193533e846c504caea293d21/agents-audit/src/main/java/org/apache/ranger/audit/utils/ORCFileUtil.java#L85]
>  # This writes an ORC file to /tmp/test.orc.
>  # Open the file, for example with Spark, and read its content. The 
> {{accessResult}} column is empty in every row, even when the 
> corresponding event had a result set.
> {code:java}
> scala> val df = spark.read.orc("/tmp/test.orc")
> df: org.apache.spark.sql.DataFrame = [repositoryType: int, repositoryName: string ... 24 more fields]
> scala> df.show(false)
> 25/01/29 19:28:12 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
> +--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
> |repositoryType|repositoryName|user|eventTime          |accessType|resourcePath            |resourceType|action|accessResult|agentId|policyId|resultReason|aclEnforcer|sessionId|clientType|clientIP |requestData|agentHostname|logType|eventId|seqNum|eventCount|eventDurationMS|additionalInfo|clusterName|zoneName|
> +--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log001  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |0      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log111  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |1      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log221  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |2      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log331  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |3      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log441  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |4      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log551  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |5      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log661  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |6      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log771  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |7      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log881  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |8      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log991  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |9      |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log10101|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |10     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log11111|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |11     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log12121|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |12     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log13131|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |13     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log14141|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |14     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log15151|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |15     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log16161|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |16     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log17171|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |17     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log18181|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |18     |0     |1         |0              |              |           |        |
> |1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log19191|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |19     |0     |1         |0              |              |           |        |
> +--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
> {code}
> h4. {*}Expected Behavior{*}:
>  * {{short}} values (the _result_ field) are converted to strings before 
> being written to ORC, so the _accessResult_ column is populated.
> h4. {*}Root Cause{*}:
>  * The {{castStringObject(Object object)}} method is missing a case for 
> {{Short}}.
>  * For a {{Short}} argument no branch matches, so the method returns 
> {{null}} and the value is written to ORC as empty.
> h4. {*}Proposed Fix{*}:
> Modify {{castStringObject(Object object)}} in {{ORCFileUtil.java}} to 
> properly handle {{Short}} values:
> {code:java}
> protected String castStringObject(Object object) {
>     String ret = null;
>     try {
>         if (object instanceof String) {
>             ret = (String) object;
>         } else if (object instanceof Date) {
>             ret = getDateString((Date) object);
>         } else if (object instanceof Short) {  // Fix: Added case for Short
>             ret = ((Short) object).toString();
>         }
>     } catch (Exception e) {
>         logger.error("Error while writing into ORC File:", e);
>     }
>     return ret;
> }
> {code}
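
The effect of the proposed change can be checked in isolation with a small standalone sketch. It is not the Ranger class: {{CastStringObjectSketch}}, {{castBeforeFix}} and {{castAfterFix}} are hypothetical names, and {{getDateString}} here is only a stand-in whose date format is an assumption. The two methods mirror the branch logic quoted above, without and with the {{Short}} case.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

class CastStringObjectSketch {
    // Hypothetical stand-in for ORCFileUtil.getDateString(); the real format may differ.
    public static String getDateString(Date date) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(date);
    }

    // Logic as described in the report: no branch for Short, so ret stays null.
    public static String castBeforeFix(Object object) {
        String ret = null;
        if (object instanceof String) {
            ret = (String) object;
        } else if (object instanceof Date) {
            ret = getDateString((Date) object);
        }
        return ret;
    }

    // Same logic with the proposed Short branch added.
    public static String castAfterFix(Object object) {
        String ret = null;
        if (object instanceof String) {
            ret = (String) object;
        } else if (object instanceof Date) {
            ret = getDateString((Date) object);
        } else if (object instanceof Short) {
            ret = ((Short) object).toString();
        }
        return ret;
    }

    public static void main(String[] args) {
        // A boxed short, standing in for the audit event's result field.
        Object result = Short.valueOf((short) 1);
        System.out.println("before fix: " + castBeforeFix(result)); // prints "before fix: null"
        System.out.println("after fix:  " + castAfterFix(result));  // prints "after fix:  1"
    }
}
```

Without the {{Short}} branch the value falls through every {{instanceof}} check and the method returns {{null}}, which matches the empty {{accessResult}} column in the Spark output above.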



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
