Edward

Thanks. I will look into the HdfsAuditLogProcessorMain class.

I will upload the sample files to the JIRA.



Thanks

Bosco


On 11/29/15, 7:56 PM, "Zhang, Edward (GDI Hadoop)" <[email protected]> wrote:

>One more thing, Bosco: could you please copy some sample HDFS audit logs,
>HBase logs and Hive logs here?
>
>I realize that with Ranger as a data source, we probably still need some
>minor code development, as follows:
>1. Substitute the existing Eagle data source (raw HDFS audit log) with the
>Ranger data source; for example, in HdfsAuditLogProcessorMain, modify the
>code to use a different log deserializer.
>2. Ensure the output of the Ranger log deserializer is compatible with the
>existing Eagle data source.
>
>With the above code changes, we automatically get all capabilities, such
>as sensitivity data join, user hadoop command reassembly, hive query
>semantics parsing, etc.
>
>Thanks
>Edward Zhang
>
>On 11/29/15, 18:52, "Zhang, Edward (GDI Hadoop)" <[email protected]> wrote:
>
>>Hi Bosco,
>>
>>Thanks for creating this ticket. It would be very helpful if Eagle could
>>use Ranger as a data source and automatically get monitoring capability
>>for 9 Hadoop components.
>>
>>If a data source does not come from Kafka and needs a lot of
>>pre-processing, integrating it is not trivial.
>>
>>Ranger's data source should be uniform in syntax, and the integration
>>should be straightforward if we have a uniform deserializer.
>>
>>I think we can document the steps for integrating a new data source.
>>
>>Thanks
>>Edward Zhang
>>
>>On 11/29/15, 12:00, "Don Bosco Durai" <[email protected]> wrote:
>>
>>>Hi Eagle team
>>>
>>>I am excited to see all the activities on this project. I have created a
>>>JIRA (https://issues.apache.org/jira/browse/EAGLE-59) to track the
>>>integration with Apache Ranger.
>>>
>>>One way to integrate is for Ranger to send the audit logs to Kafka in
>>>the same format as the native logs. However, Ranger already normalizes
>>>the audit format for all the components, so reconstructing the native
>>>format might not be a good way to go.
>>>
>>>I am still getting familiar with the internals of Apache Eagle, but it
>>>would be great if someone could help me or document how a 3rd party
>>>source can be integrated with Apache Eagle. Also, what changes are
>>>required on the analytics side to support new data sources? E.g., if we
>>>integrate with Ranger audit logs, we would get audit logs from around 9
>>>components right away. How can we use them?
>>>
>>>If you are okay with it, I am willing to work on this JIRA.
>>>
>>>Thanks
>>>
>>>Bosco
>>> 
>>>
>>
>
