[jira] [Updated] (AMBARI-17785) Provide support for S3 as a first class destination for log events

Hemanth Yamijala (JIRA) Tue, 02 Aug 2016 22:05:02 -0700

     [ https://issues.apache.org/jira/browse/AMBARI-17785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Hemanth Yamijala updated AMBARI-17785:
--------------------------------------
    Attachment: AMBARI-17785-1.patch

Attaching a new patch with some improvements. The full set of functionality 
included in this patch is:

* Supports one-at-a-time processing of log events in the {{OutputS3File}} case. 
These are spooled locally and uploaded periodically to S3.
* Supports upload based on two criteria: a file size threshold and a time-based 
threshold.
* Refactors code to achieve the above without duplicating any existing 
functions. For example, the code path that uploads files all at once is 
retained and uses the same helper classes, such as {{S3Uploader}}.
* Unit tests added for all new code.
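
To illustrate the size- and time-based upload criteria above, here is a minimal 
Java sketch. It is not code from the patch: the class SpoolingOutput, its 
parameters, the injected clock, and the uploader callback (a stand-in for 
something like {{S3Uploader}}) are all hypothetical names chosen for 
illustration, and the spool is held in memory rather than in a local file to 
keep the example short.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

// Hypothetical sketch; none of these names are taken from the patch.
class SpoolingOutput {
    private final long maxBytes;                    // size threshold for a rollover
    private final long maxAgeMillis;                // time threshold for a rollover
    private final LongSupplier clock;               // injected so the time rule is testable
    private final Consumer<List<String>> uploader;  // stand-in for the real S3 upload

    private List<String> spool = new ArrayList<>();
    private long spoolBytes = 0;
    private long spoolStartedAt = 0;

    SpoolingOutput(long maxBytes, long maxAgeMillis,
                   LongSupplier clock, Consumer<List<String>> uploader) {
        this.maxBytes = maxBytes;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
        this.uploader = uploader;
    }

    // Called once per log event: spool it, then check the thresholds.
    void write(String event) {
        if (spool.isEmpty()) {
            spoolStartedAt = clock.getAsLong();     // age is measured from the first event
        }
        spool.add(event);
        // Count the UTF-8 bytes plus one for the trailing newline.
        spoolBytes += event.getBytes(StandardCharsets.UTF_8).length + 1;
        tick();
    }

    // Also intended to be called from a periodic timer, so that a spool
    // that stops receiving events is still uploaded once it is old enough.
    void tick() {
        if (shouldRollover()) {
            rollover();
        }
    }

    boolean shouldRollover() {
        if (spool.isEmpty()) {
            return false;                           // nothing to upload
        }
        boolean tooBig = spoolBytes >= maxBytes;
        boolean tooOld = clock.getAsLong() - spoolStartedAt >= maxAgeMillis;
        return tooBig || tooOld;
    }

    // Hand the current spool to the uploader and start a fresh one.
    void rollover() {
        List<String> batch = spool;
        spool = new ArrayList<>();
        spoolBytes = 0;
        uploader.accept(batch);
    }
}
```

In this sketch either threshold alone triggers an upload, and an empty spool is 
never uploaded. Whether the patch combines the two criteria in the same way is 
an assumption on my part.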

A lot is still needed to make this production quality, including error 
handling, configuration, and security. I will take these up separately. I am 
still blocked by AMBARI-17788 from making this patch available or uploading it 
to Review Board.

> Provide support for S3 as a first class destination for log events
> ------------------------------------------------------------------
>
>                 Key: AMBARI-17785
>                 URL: https://issues.apache.org/jira/browse/AMBARI-17785
>             Project: Ambari
>          Issue Type: Improvement
>          Components: ambari-logsearch
>            Reporter: Hemanth Yamijala
>            Assignee: Hemanth Yamijala
>         Attachments: AMBARI-17785-1.patch, AMBARI-17785.patch
>
>
> AMBARI-17045 added support for uploading Hadoop service logs from machines to 
> S3. The intended usage there was as a one-time trigger where, on demand, the 
> log files matching certain paths can be uploaded to a given S3 bucket and 
> path.
> While useful, there are some use cases where we might need more than this 
> one-time activity, particularly when clusters are deployed on ephemeral 
> machines such as cloud instances:
> * The machines running the logfeeder could be irrevocably lost and in that 
> case we would not be able to retrieve any logs.
> * If logs generated over a long period of time are copied all at once, the 
> time to copy them at the end could extend cluster up-time and cost.
> It would be useful to support S3 as another output destination in logsearch, 
> just like Kafka, Solr, etc. This JIRA is to track work towards this 
> enhancement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
