Hey!

I don't think you can do selective removals; I've never heard of it, but
who knows..

You can refer here to see all the available options ->
https://spark.apache.org/docs/latest/monitoring.html

In my experience, having 4 days' worth of logs is enough: usually if
something fails you check it right away, unless it is the weekend.
Depending on the use case, though, you could store more days.
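
For what it's worth, the closest built-in mechanism on that page is the
history server's age-based cleaner. A minimal spark-defaults.conf sketch
(the 4d value here is just an example matching the above, not a
recommendation):

    # Periodically delete old event logs from the log directory
    spark.history.fs.cleaner.enabled   true
    # How often the cleaner checks for expired logs
    spark.history.fs.cleaner.interval  1d
    # Event logs older than this are deleted
    spark.history.fs.cleaner.maxAge    4d

Note it only filters by age (plus, in Spark 3.0+, a cap on the number of
files via spark.history.fs.cleaner.maxNum), so it can't keep older logs
while dropping newer ones.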

On Mon, 22 Mar 2021 at 23:52, Hung Vu <h...@snapchat.com.invalid> wrote:

> Hi,
>
> I have a couple of questions regarding the Spark history server:
>
> 1. Is there a way for a cluster to selectively clean old files? For
> example, if we want to keep some logs from 3 days ago but also clean some
> logs from 2 days ago, is there a filter or config to do that?
> 2. We have over 1,000 log files each day. If we want to keep those jobs
> for a week (7,000 jobs in total), this could make the load time longer. Do
> you have any suggestions for handling this?
> 3. We plan to have 2 paths: a long-term history server and a short-term
> history server. We would move some log files from the short-term to the
> long-term server if we need to investigate something. Would this be a good
> idea? Do you have any input on this?
>
> Thank you in advance!
>
