[ 
https://issues.apache.org/jira/browse/AMBARI-21810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olivér Szabó updated AMBARI-21810:
----------------------------------
    Description: 
In Ambari 3.0, LogSearch will include more fully-featured support in this area, 
but this current script will be used in Ambari 2.6.0, as a way to simplify the 
customer's use cases in the areas of log data retention, log purging, and log 
archiving.

The script solrDataManager.py (located in the /usr/lib/ambari-infra-solr-client 
folder) accepts a mode parameter, which may be "delete" or "save". In both modes 
the user may specify the filter field, an end value or the number of days to 
keep, and optionally a Kerberos keytab/principal for Solr. In "save" mode the 
user must also specify a destination: HDFS arguments, S3 arguments, or a local 
path to save to. The user may additionally specify the size of the read block 
(documents returned by one Solr query) and the write block (documents in one 
output file).
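To illustrate how the read block and write block interact, here is a minimal 
Python sketch (not the actual solrDataManager.py implementation; the fetch 
function is a stand-in for a real Solr query): documents are pulled read_block 
at a time, and a new output file is started every write_block documents.
{code:python}
def save_in_blocks(fetch, read_block, write_block):
    """Group documents pulled read_block at a time into "files" of
    write_block documents each. Returns the files as lists of documents."""
    files, current = [], []
    start = 0
    while True:
        batch = fetch(start, read_block)  # one "Solr query" of read_block docs
        if not batch:
            break
        start += len(batch)
        for doc in batch:
            current.append(doc)
            if len(current) == write_block:  # write block full -> new file
                files.append(current)
                current = []
    if current:  # flush the trailing partial file
        files.append(current)
    return files

# Stand-in data source: 25 documents, fetched 10 at a time (as with -r 10),
# written 7 per file for demonstration.
docs = list(range(25))
result = save_in_blocks(lambda s, r: docs[s:s + r], read_block=10, write_block=7)
print([len(f) for f in result])  # -> [7, 7, 7, 4]
{code}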

Examples:

Save data from the solr collection hadoop_logs accessible at 
http://c6401.ambari.apache.org:8886/solr based on the field logtime, save 
everything older than 1 day, read 10 documents at once, write 100 documents 
into a file, and copy the zip files into the local directory /tmp. Do this in 
verbose mode:
{code:java}
/usr/bin/python solrDataManager.py -m save -s 
http://c6401.ambari.apache.org:8886/solr -c hadoop_logs -f logtime -d 1 -r 10 
-w 100 -x /tmp -v
{code}

Save the last 3 days of hadoop_logs into HDFS path "/" with the user hdfs, 
fetching data from a kerberized Solr:
{code:java}
/usr/bin/python solrDataManager.py -m save -s 
http://c6401.ambari.apache.org:8886/solr -c hadoop_logs -f logtime -d 3 -r 10 
-w 100 -k /etc/security/keytabs/ambari-infra-solr.service.keytab -n 
infra-solr/[email protected] -u hdfs -p /
{code}

Delete the data before 2017-08-29T12:00:00.000Z:
{code:java}
/usr/bin/python solrDataManager.py -m delete -s 
http://c6401.ambari.apache.org:8886/solr -c hadoop_logs -f logtime -e 
2017-08-29T12:00:00.000Z
{code}

> Create Utility Script to support Solr Collection Data 
> Retention/Purging/Archiving
> ---------------------------------------------------------------------------------
>
>                 Key: AMBARI-21810
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21810
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-infra
>    Affects Versions: 2.6.0
>            Reporter: Miklos Gergely
>            Assignee: Miklos Gergely
>             Fix For: 2.6.0
>
>         Attachments: AMBARI-21810.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
