Github user mattf commented on the pull request:
https://github.com/apache/spark/pull/2471#issuecomment-57629838
> @mattf don't know what you mean by "functionality that is already
> provided by the system". I'm not aware of HDFS having any way to
> automatically do housekeeping of old files.
a system approach means using something like logrotate, or a cleaner process
run from cron.
such an approach is beneficial in a number of ways, including reducing the
complexity of spark by not duplicating functionality that's already available
in spark's environment - akin to using a standard library for i/o instead of
interacting with devices directly. in this case the environment is the
system, where tools like logrotate and cron are readily available.
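to make that concrete, here's a minimal logrotate sketch for local spark logs (the path, retention, and options are assumptions for illustration, not something from this thread):

```
# /etc/logrotate.d/spark (hypothetical): rotate spark worker/driver logs
# daily, keep 7 compressed generations, tolerate missing or empty files.
/var/log/spark/*.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}
```

logrotate is then driven by the system's daily cron job, so spark itself carries no rotation logic at all.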
as for rotating logs in hdfs - i wouldn't expect hdfs itself to provide such
a feature, because it serves a specific use case built on top of hdfs. some
searching turns up a few existing solutions for rotating or pruning files in
hdfs, and since hdfs is distributed, rotating/pruning/cleaning/purging can be
done remotely and independently of spark.
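for the hdfs case, a cron-driven cleanup could be as small as the shell sketch below. the directory, retention window, and script name are assumptions for illustration; `hdfs dfs -ls` and `hdfs dfs -rm` are the standard client commands, and the prune step is skipped when no hdfs client is on PATH:

```shell
#!/bin/sh
# Hypothetical cleanup script: prune spark event logs in HDFS that are
# older than RETENTION_DAYS. Run it from cron, e.g.:
#   0 3 * * * /usr/local/bin/prune-spark-logs.sh
RETENTION_DAYS=${RETENTION_DAYS:-7}
LOG_DIR=${LOG_DIR:-/spark/event-logs}

# Timestamp (seconds since epoch) before which files are deleted (GNU date).
cutoff=$(date -d "$RETENTION_DAYS days ago" +%s)

prune() {
  # `hdfs dfs -ls` prints: perms repl owner group size date time path;
  # the leading "Found N items" line leaves $f empty and is skipped.
  hdfs dfs -ls "$LOG_DIR" 2>/dev/null | while read -r _ _ _ _ _ d t f; do
    [ -n "$f" ] || continue
    ts=$(date -d "$d $t" +%s) || continue
    [ "$ts" -lt "$cutoff" ] && hdfs dfs -rm -skipTrash "$f"
  done
}

# Only attempt the prune when an HDFS client is actually installed.
if command -v hdfs >/dev/null 2>&1; then
  prune
fi
```

because the hdfs client talks to the namenode over the network, this can run on any admin host - nothing about it needs to live inside spark.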