Rahul Singhal created SPARK-2127:
------------------------------------

             Summary: Use application specific folders to dump metrics via CsvSink
                 Key: SPARK-2127
                 URL: https://issues.apache.org/jira/browse/SPARK-2127
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 1.0.0
            Reporter: Rahul Singhal
            Priority: Minor


Currently, when using the CsvSink, every application's CSV metrics are dumped in 
the root folder (configured via "*.sink.csv.directory" in metrics.properties). 
Files with common names (e.g. "jvm.PS-MarkSweep.count.csv") are shared across 
applications, and if the same application is run multiple times, its metrics 
are appended to the files left by previous runs.

This makes it harder to parse these files and extract the information one is 
looking for. I suggest creating a unique folder every time an application is 
run and dumping only that run's metrics into it. The folder could be named 
similarly to the one that is currently created for logging application events 
(e.g. "spark-pi-1402484928439").
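
To illustrate the naming scheme, here is a minimal sketch in Python (not Spark 
code). It assumes the folder name is the lowercased application name joined to 
the start time in epoch milliseconds, as in the example above. The function 
name unique_metrics_dir and the sanitizing rule are hypothetical, chosen only 
for illustration.

```python
import os
import re
import tempfile
import time


def unique_metrics_dir(root, app_name, now_ms=None):
    """Create and return <root>/<sanitized-app-name>-<epoch-millis>.

    Hypothetical helper: the sanitizing rule below is an assumption,
    not the rule Spark itself applies.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # Lowercase the name and collapse every run of characters outside
    # [a-z0-9] into a single "-", trimming any leading/trailing "-".
    safe = re.sub(r"[^a-z0-9]+", "-", app_name.lower()).strip("-")
    path = os.path.join(root, "{}-{}".format(safe, now_ms))
    # makedirs raises if the folder already exists, so two runs can
    # never end up writing into the same folder.
    os.makedirs(path)
    return path


if __name__ == "__main__":
    # Stand-in for the directory configured via "*.sink.csv.directory".
    root = tempfile.mkdtemp()
    print(unique_metrics_dir(root, "Spark Pi"))
```

Each run then gets its own folder under the configured root, so files such as 
"jvm.PS-MarkSweep.count.csv" are neither shared between applications nor 
appended to across runs.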



--
This message was sent by Atlassian JIRA
(v6.2#6252)
