-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63552/#review222758
-----------------------------------------------------------


Ship it!




Ship It!

- Velmurugan Periasamy


On March 31, 2021, 6:01 p.m., Ramesh Mani wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63552/
> -----------------------------------------------------------
> 
> (Updated March 31, 2021, 6:01 p.m.)
> 
> 
> Review request for ranger, Don Bosco Durai, Abhay Kulkarni, Madhan Neethiraj, 
> Mehul Parikh, Selvamohan Neethiraj, Sailaja Polavarapu, and Velmurugan 
> Periasamy.
> 
> 
> Bugs: RANGER-1837
>     https://issues.apache.org/jira/browse/RANGER-1837
> 
> 
> Repository: ranger
> 
> 
> Description
> -------
> 
> RANGER-1837: Enhance Ranger Audit to HDFS to support ORC file format
> 
> 
> Diffs
> -----
> 
>   agents-audit/pom.xml b9f6af27c 
>   agents-audit/src/main/java/org/apache/ranger/audit/destination/HDFSAuditDestination.java 5e6f40226 
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditHandler.java 4ce31dd09 
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditProviderFactory.java 6b7f4b00b 
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditWriterFactory.java PRE-CREATION 
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/BaseAuditHandler.java 54f37644b 
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/DummyAuditProvider.java 05f882ff3 
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/MiscUtil.java e2b74489b 
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/MultiDestAuditProvider.java 282f5abfa 
>   agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileCacheProviderSpool.java 41513ba40 
>   agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileQueue.java PRE-CREATION 
>   agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileQueueSpool.java PRE-CREATION 
>   agents-audit/src/main/java/org/apache/ranger/audit/utils/AbstractRangerAuditWriter.java PRE-CREATION 
>   agents-audit/src/main/java/org/apache/ranger/audit/utils/ORCFileUtil.java PRE-CREATION 
>   agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerAuditWriter.java PRE-CREATION 
>   agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerJSONAuditWriter.java PRE-CREATION 
>   agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerORCAuditWriter.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/63552/diff/7/
> 
> 
> Testing
> -------
> 
> Testing done locally.
> 
> ORC FILE FORMAT in HDFS Ranger Audit log with local audit file store as 
> source for HDFS audit:
>       NOTE: When this is done, each record in the local file is read to 
> create the ORC file.
> 
>     1. Enable Ranger Audit to HDFS in ORC file format using AuditFileQueue
>         - To enable Ranger Audit to HDFS with ORC format, first enable 
> AuditFileQueue to spool the audit logs to a local directory.
>             * On the Namenode host, create the spool directory and make sure 
> it is readable, writable, and executable by the owner of the service for 
> which the Ranger plugin is enabled (e.g. hdfs:hadoop for the HDFS service, 
> hive:hadoop for the Hive service, etc.)
> 
>                 $ mkdir -p /var/log/hadoop/audit/staging/spool
>                 $ cd /var/log/hadoop/audit/staging
>                 $ chown hdfs:hadoop spool
> 
>             * Enable AuditFileQueue via the following params in 
> ranger-<component>-audit.xml:
> 
>                xasecure.audit.destination.hdfs.batch.queuetype=filequeue
>                    (NOTE: default = memqueue, where an in-memory queue / 
> buffer is used instead of a local file buffer)
>                xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec=300
>                    (This determines the batch size of the ORC file that is 
> created)
>                xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir=/var/log/hadoop/audit/staging/spool
>                    (This is the local staging directory for audit logs)
>                xasecure.audit.destination.hdfs.batch.filequeue.filespool.buffer.size=10000
>                    (This determines the batch size for ORC file creation, 
> along with the rollover.sec parameter)
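For reference, a sketch of how the filequeue settings above could be expressed in the Hadoop-style <property> format used by ranger-<component>-audit.xml (values are the examples from the steps above; adjust them per cluster):

```xml
<!-- Sketch: filequeue spool settings for the HDFS audit destination.
     Property names are from this patch; values are illustrative. -->
<property>
  <name>xasecure.audit.destination.hdfs.batch.queuetype</name>
  <value>filequeue</value>
</property>
<property>
  <name>xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec</name>
  <value>300</value>
</property>
<property>
  <name>xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir</name>
  <value>/var/log/hadoop/audit/staging/spool</value>
</property>
<property>
  <name>xasecure.audit.destination.hdfs.batch.filequeue.filespool.buffer.size</name>
  <value>10000</value>
</property>
```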
> 
>     2. Enable the ORC file format for Ranger HDFS audit.
>           - This is done with the following param in 
> ranger-<component>-audit.xml. By default the value is "json".
> 
>             xasecure.audit.destination.hdfs.filetype=orc (default = json)
> 
>     3. Provision to control the compression technique for the ORC format. 
> Default is 'snappy'.
> 
>             xasecure.audit.destination.hdfs.orc.compression=snappy|lzo|zlib|none
> 
>     4. Buffer size and stripe size of the ORC file batch. Defaults are 
> '10000' bytes and '100000' bytes respectively. These decide the batch size 
> of the ORC file in HDFS.
>             xasecure.audit.destination.hdfs.orc.buffersize= (value in bytes)
>             xasecure.audit.destination.hdfs.orc.stripesize= (value in bytes)
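The ORC-specific settings from steps 2-4 can likewise be sketched in the same <property> format (all values shown other than "orc" are the stated defaults, so they only need to be set when overriding):

```xml
<!-- Sketch: ORC-specific audit settings from steps 2-4 above.
     Values other than "orc" are the documented defaults. -->
<property>
  <name>xasecure.audit.destination.hdfs.filetype</name>
  <value>orc</value>
</property>
<property>
  <name>xasecure.audit.destination.hdfs.orc.compression</name>
  <value>snappy</value>
</property>
<property>
  <name>xasecure.audit.destination.hdfs.orc.buffersize</name>
  <value>10000</value>
</property>
<property>
  <name>xasecure.audit.destination.hdfs.orc.stripesize</name>
  <value>100000</value>
</property>
```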
> 
>     5. Hive query to create an ORC table with the default 'snappy' compression.
> 
>         CREATE EXTERNAL TABLE ranger_audit_event (
>         repositoryType int,
>         repositoryName string,
>         reqUser string,
>         evtTime string,
>         accessType string,
>         resourcePath string,
>         resourceType string,
>         action  string,
>         accessResult string,
>         agentId string,
>         policyId  bigint,
>         resultReason string,
>         aclEnforcer string,
>         sessionId string,
>         clientType string,
>         clientIP string,
>         requestData string,
>         clusterName string
>         )
>         STORED AS ORC
>         LOCATION '/ranger/audit/hdfs'
>         TBLPROPERTIES  ("orc.compress"="SNAPPY");
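Once the external table above is created over the audit location, the ORC audit files can be queried with standard HiveQL. A hypothetical example (column names are from the DDL above; the assumption that accessResult '0' denotes a denied request should be verified against actual audit data):

```sql
-- Example: top users by denied-access count, using the table defined above.
-- Assumption: accessResult = '0' marks a denied request.
SELECT reqUser, accessType, COUNT(*) AS denied_count
FROM ranger_audit_event
WHERE accessResult = '0'
GROUP BY reqUser, accessType
ORDER BY denied_count DESC
LIMIT 10;
```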
> 
> 
> -------------------------
> 
> JSON FILE FORMAT in HDFS Ranger Audit log with local audit file store as 
> source for HDFS audit:
>       NOTE: When this is done, each local file is copied in its entirety to 
> the HDFS destination. This lets us generate larger Ranger audit files in 
> HDFS, which is preferred.
>       
>        1. Enable Ranger Audit to HDFS in JSON file format using AuditFileQueue
>         - To enable Ranger Audit to HDFS with JSON format and a local file 
> cache, first enable AuditFileQueue to spool the audit logs locally.
> 
>             * On the Namenode host, create the spool directory and make sure 
> it is readable, writable, and executable by the owner of the service for 
> which the Ranger plugin is enabled (e.g. hdfs:hadoop for the HDFS service, 
> hive:hadoop for the Hive service, etc.)
> 
>                 $ mkdir -p /var/log/hadoop/audit/staging/spool
>                 $ cd /var/log/hadoop/audit/staging
>                 $ chown hdfs:hadoop spool
> 
>             * Enable AuditFileQueue via the following params in 
> ranger-<component>-audit.xml:
> 
>                xasecure.audit.destination.hdfs.batch.queuetype=filequeue
>                    (NOTE: default = memqueue, where an in-memory queue / 
> buffer is used instead of a local file buffer)
>                xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec=300
>                    (This determines the size of the JSON file that is copied 
> to HDFS)
>                xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir=/var/log/hadoop/audit/staging/spool
>                    (This is the local staging directory for audit logs)
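The JSON-path settings above can also be sketched in the <property> format of ranger-<component>-audit.xml; note that the filetype property is simply left at its default ("json") in this setup:

```xml
<!-- Sketch: filequeue settings for JSON audit files. The
     xasecure.audit.destination.hdfs.filetype property defaults to
     "json", so it is not set here. Values are illustrative. -->
<property>
  <name>xasecure.audit.destination.hdfs.batch.queuetype</name>
  <value>filequeue</value>
</property>
<property>
  <name>xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec</name>
  <value>300</value>
</property>
<property>
  <name>xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir</name>
  <value>/var/log/hadoop/audit/staging/spool</value>
</property>
```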
> 
> 
> Thanks,
> 
> Ramesh Mani
> 
>
