[jira] [Commented] (RANGER-1938) Solr for Audit setup doesn't use DocValues effectively

Don Bosco Durai (JIRA) Wed, 20 Dec 2017 04:50:58 -0800

    [ 
https://issues.apache.org/jira/browse/RANGER-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298427#comment-16298427
 ]


Don Bosco Durai commented on RANGER-1938:
-----------------------------------------

[~risdenk], thanks for your suggestions and the blog. I have linked to it from 
the Apache Ranger Wiki.

A few notes:
# If Apache Ambari is used to manage Ranger, then Ambari has its own template 
for solr-config
# If you are using Ambari-Infra Solr, then it is shared by Ambari LogSearch and 
Apache Atlas
# We have to emphasize that changing the schema requires rebuilding the Solr 
collection, which means all existing audits in Solr will be deleted. There is 
an option to rebuild it from the audits stored in HDFS, but currently there is 
no known documentation or script for that
# I have given review comments on Review Board. Essentially, I would prefer to 
set docValues at the individual field level, rather than at the global/default 
fieldType level.
# I have a request. I know it is very difficult to recommend configuration 
settings for Solr, but can you share the configuration of your current setup?
1. Memory setting for Solr
2. Number of shards and replication
3. Number of days for TTL
4. Max number of documents (based on TTL)
5. Are Solr instances running on dedicated servers or one of the master servers?
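
To illustrate the field-level approach from the notes above, here is a minimal 
sketch, assuming a managed schema where the Solr Schema API is available. The 
field names and types are hypothetical placeholders, not the actual Ranger 
audit schema. It only builds the JSON bodies that would be POSTed to the 
collection's /schema endpoint; as noted above, existing audits would still 
have to be rebuilt after a schema change.

```python
import json

# Hypothetical fields for illustration only; the real Ranger audit schema
# may use different names and types.
FIELDS = [
    {"name": "reqUser", "type": "string"},
    {"name": "resource", "type": "string"},
]


def docvalues_payloads(fields):
    """Return one Schema API request body (JSON string) per field.

    Each body uses the "replace-field" command, which expects the full
    field definition, and enables docValues on that field only rather
    than on the shared fieldType.
    """
    payloads = []
    for field in fields:
        definition = dict(field)          # copy so the input is not mutated
        definition["docValues"] = True    # per-field setting
        payloads.append(json.dumps({"replace-field": definition}))
    return payloads


if __name__ == "__main__":
    for body in docvalues_payloads(FIELDS):
        print(body)
```

Keeping the setting on each field leaves the other fields that share the same 
fieldType unaffected.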

Thanks

> Solr for Audit setup doesn't use DocValues effectively
> ------------------------------------------------------
>
>                 Key: RANGER-1938
>                 URL: https://issues.apache.org/jira/browse/RANGER-1938
>             Project: Ranger
>          Issue Type: Bug
>          Components: audit
>    Affects Versions: 0.6.0, 0.7.0, 0.6.1, 0.6.2, 0.6.3, 0.7.1
>            Reporter: Kevin Risden
>            Assignee: Kevin Risden
>              Labels: performance
>             Fix For: 1.0.0, 0.7.2
>
>         Attachments: 
> 0001-RANGER-1938-Enable-DocValues-for-more-fields-in-Solr.patch
>
>
> Ranger uses Ambari Infra Solr (or another Apache Solr install) for storing 
> Ranger Audit events for displaying in Ranger Admin. In our case, we have 
> noticed quite a few Ambari Infra Solr OOM errors due to Ranger. I've talked 
> with a few other people who are having very similar problems with OOM errors.
> I've typed up some details about why the way Ranger uses Solr requires a 
> lot of heap. I've also outlined the fix for this, which significantly reduced 
> the amount of heap memory required. I'm an Apache Lucene/Solr committer, so 
> this optimization/usage might not be immediately obvious to others using 
> Solr, especially version 5.x.
> https://risdenk.github.io/2017/12/18/ambari-infra-solr-ranger.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
