-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63552/#review222758
-----------------------------------------------------------
Ship it!

Ship It!

- Velmurugan Periasamy


On March 31, 2021, 6:01 p.m., Ramesh Mani wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63552/
> -----------------------------------------------------------
> 
> (Updated March 31, 2021, 6:01 p.m.)
> 
> 
> Review request for ranger, Don Bosco Durai, Abhay Kulkarni, Madhan Neethiraj, Mehul Parikh, Selvamohan Neethiraj, Sailaja Polavarapu, and Velmurugan Periasamy.
> 
> 
> Bugs: RANGER-1837
>     https://issues.apache.org/jira/browse/RANGER-1837
> 
> 
> Repository: ranger
> 
> 
> Description
> -------
> 
> RANGER-1837: Enhance Ranger Audit to HDFS to support ORC file format
> 
> 
> Diffs
> -----
> 
>   agents-audit/pom.xml b9f6af27c
>   agents-audit/src/main/java/org/apache/ranger/audit/destination/HDFSAuditDestination.java 5e6f40226
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditHandler.java 4ce31dd09
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditProviderFactory.java 6b7f4b00b
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/AuditWriterFactory.java PRE-CREATION
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/BaseAuditHandler.java 54f37644b
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/DummyAuditProvider.java 05f882ff3
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/MiscUtil.java e2b74489b
>   agents-audit/src/main/java/org/apache/ranger/audit/provider/MultiDestAuditProvider.java 282f5abfa
>   agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileCacheProviderSpool.java 41513ba40
>   agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileQueue.java PRE-CREATION
>   agents-audit/src/main/java/org/apache/ranger/audit/queue/AuditFileQueueSpool.java PRE-CREATION
>   agents-audit/src/main/java/org/apache/ranger/audit/utils/AbstractRangerAuditWriter.java PRE-CREATION
>   agents-audit/src/main/java/org/apache/ranger/audit/utils/ORCFileUtil.java PRE-CREATION
>   agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerAuditWriter.java PRE-CREATION
>   agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerJSONAuditWriter.java PRE-CREATION
>   agents-audit/src/main/java/org/apache/ranger/audit/utils/RangerORCAuditWriter.java PRE-CREATION
> 
> 
> Diff: https://reviews.apache.org/r/63552/diff/7/
> 
> 
> Testing
> -------
> 
> Testing done locally.
> 
> ORC FILE FORMAT in HDFS Ranger Audit log, with the local audit file store as the source for HDFS audit:
> NOTE: When this is enabled, each record in the local file is read to create the ORC file.
> 
> 1. Enable Ranger Audit to HDFS in ORC file format using AuditFileQueue.
>    - To enable Ranger Audit to HDFS with the ORC format, first enable AuditFileQueue so that audit records are spooled to a local file.
>      * On the Namenode host, create the spool directory and make sure the path is readable/writable/executable by the owner of the service for which the Ranger plugin is enabled (e.g. hdfs:hadoop for the HDFS service, hive:hadoop for the Hive service, etc.)
> 
>        $ mkdir -p /var/log/hadoop/audit/staging/spool
>        $ cd /var/log/hadoop/audit/staging
>        $ chown hdfs:hadoop spool
> 
>      * Enable AuditFileQueue via the following params in ranger-<component>-audit.xml:
> 
>        xasecure.audit.destination.hdfs.batch.queuetype=filequeue
>        (NOTE: default = memqueue, in which a memory queue / buffer is used instead of a local file buffer)
> 
>        xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec=300
>        (This determines the batch size of the ORC file that is created)
> 
>        xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir=/var/log/hadoop/audit/staging/spool
>        (This is the local staging directory for audit)
> 
>        xasecure.audit.destination.hdfs.batch.filequeue.filespool.buffer.size=10000
>        (This determines the batch size for ORC file creation, together with the rollover.sec parameter)
> 
> 2. Enable the ORC file format for Ranger HDFS audit.
>    - This is done with the following param in ranger-<component>-audit.xml. By default the value is "json".
> 
>        xasecure.audit.destination.hdfs.filetype=orc (default = json)
> 
> 3. Provision to control the compression technique for the ORC format. Default is 'snappy'.
> 
>        xasecure.audit.destination.hdfs.orc.compression=snappy|lzo|zlib|none
> 
> 4. Buffer size and stripe size of the ORC file batch. Defaults are '10000' bytes and '100000' bytes respectively. These decide the batch size of the ORC file in HDFS (a consolidated configuration sketch follows step 5).
> 
>        xasecure.audit.destination.hdfs.orc.buffersize= (value in bytes)
>        xasecure.audit.destination.hdfs.orc.stripesize= (value in bytes)
> 
> 5. Hive query to create an ORC table with the default 'snappy' compression:
> 
>        CREATE EXTERNAL TABLE ranger_audit_event (
>          repositoryType int,
>          repositoryName string,
>          reqUser string,
>          evtTime string,
>          accessType string,
>          resourcePath string,
>          resourceType string,
>          action string,
>          accessResult string,
>          agentId string,
>          policyId bigint,
>          resultReason string,
>          aclEnforcer string,
>          sessionId string,
>          clientType string,
>          clientIP string,
>          requestData string,
>          clusterName string
>        )
>        STORED AS ORC
>        LOCATION '/ranger/audit/hdfs'
>        TBLPROPERTIES ("orc.compress"="SNAPPY");
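> 
> For reference, a consolidated sketch of how the properties from steps 1-4 might sit together in ranger-<component>-audit.xml, assuming the HDFS plugin config and the standard Hadoop-style <property> layout inside the file's <configuration> element. Values are the examples used above and should be tuned per cluster; the sizing properties from steps 1 and 4 follow the same pattern:
> 
>        <!-- spool audit records to a local file queue before they are written to HDFS -->
>        <property>
>          <name>xasecure.audit.destination.hdfs.batch.queuetype</name>
>          <value>filequeue</value>
>        </property>
>        <property>
>          <name>xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir</name>
>          <value>/var/log/hadoop/audit/staging/spool</value>
>        </property>
>        <property>
>          <name>xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec</name>
>          <value>300</value>
>        </property>
>        <!-- write the HDFS destination files in ORC format with snappy compression -->
>        <property>
>          <name>xasecure.audit.destination.hdfs.filetype</name>
>          <value>orc</value>
>        </property>
>        <property>
>          <name>xasecure.audit.destination.hdfs.orc.compression</name>
>          <value>snappy</value>
>        </property>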
> 
> -------------------------
> 
> JSON FILE FORMAT in HDFS Ranger Audit log, with the local audit file store as the source for HDFS audit:
> NOTE: When this is enabled, each local file is copied in its entirety to the HDFS destination. This lets us generate larger Ranger audit files in HDFS, which is preferred.
> 
> 1. Enable Ranger Audit to HDFS in JSON file format using AuditFileQueue.
>    - To enable Ranger Audit to HDFS with the JSON format and a local file cache, first enable AuditFileQueue so that audit records are spooled to a local file.
> 
>      * On the Namenode host, create the spool directory and make sure the path is readable/writable/executable by the owner of the service for which the Ranger plugin is enabled (e.g. hdfs:hadoop for the HDFS service, hive:hadoop for the Hive service, etc.)
> 
>        $ mkdir -p /var/log/hadoop/audit/staging/spool
>        $ cd /var/log/hadoop/audit/staging
>        $ chown hdfs:hadoop spool
> 
>      * Enable AuditFileQueue via the following params in ranger-<component>-audit.xml (a minimal XML sketch follows this list):
> 
>        xasecure.audit.destination.hdfs.batch.queuetype=filequeue
>        (NOTE: default = memqueue, in which a memory queue / buffer is used instead of a local file buffer)
> 
>        xasecure.audit.destination.hdfs.batch.filequeue.filespool.file.rollover.sec=300
>        (This determines the size of the JSON file that is copied to HDFS)
> 
>        xasecure.audit.destination.hdfs.batch.filequeue.filespool.dir=/var/log/hadoop/audit/staging/spool
>        (This is the local staging directory for audit)
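> 
> As in the ORC section, a minimal Hadoop-style XML sketch of the property that switches the HDFS audit destination to the local file queue; the rollover.sec and filespool.dir properties are set exactly as in the ORC sketch above, and xasecure.audit.destination.hdfs.filetype is simply left at its default value "json", so no ORC-specific properties are needed:
> 
>        <!-- use the local file queue instead of the default in-memory queue -->
>        <property>
>          <name>xasecure.audit.destination.hdfs.batch.queuetype</name>
>          <value>filequeue</value>
>        </property>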
> 
> 
> Thanks,
> 
> Ramesh Mani
> 
> 