[ https://issues.apache.org/jira/browse/SPARK-10774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yangping wu updated SPARK-10774:
--------------------------------
    Description: 
Right now, Spark writes all event logs (in-progress or finished) into the same 
directory (configured by the **spark.eventLog.dir** parameter), as shown below:
{noformat}
[[email protected] /]$ sudo hadoop fs -ls /spark-jobs/eventLog
Found 58 items
-rwxrwxrwx   3 spark aaa        8438 2015-09-17 15:14 /spark-jobs/eventLog/application_1440152921247_0047_1.lz4
-rwxrwxrwx   3 spark aaa       44002 2015-09-17 15:15 /spark-jobs/eventLog/application_1440152921247_0190_1
-rwxrwxrwx   3 spark aaa       44696 2015-09-17 15:15 /spark-jobs/eventLog/application_1440152921247_0190_2
-rwxrwxrwx   3 spark aaa       40813 2015-09-17 15:25 /spark-jobs/eventLog/application_1440152921247_0191_1
-rwxrwxrwx   3 spark aaa       44680 2015-09-17 15:25 /spark-jobs/eventLog/application_1440152921247_0191_2
-rwxrwxrwx   3 spark aaa       42572 2015-09-17 15:36 /spark-jobs/eventLog/application_1440152921247_0192_1
-rwxrwxrwx   3 spark aaa       44680 2015-09-17 15:36 /spark-jobs/eventLog/application_1440152921247_0192_2
-rwxrwxrwx   3 spark aaa       45052 2015-09-17 16:09 /spark-jobs/eventLog/application_1440152921247_0193_1
-rwxrwxrwx   3 spark aaa       44688 2015-09-17 16:09 /spark-jobs/eventLog/application_1440152921247_0193_2
-rwxrwxrwx   3 spark aaa       41686 2015-09-17 16:11 /spark-jobs/eventLog/application_1440152921247_0194_1
-rwxrwxrwx   3 spark aaa       44522 2015-09-17 16:11 /spark-jobs/eventLog/application_1440152921247_0194_2
-rwxrwxrwx   3 spark aaa       32261 2015-09-17 16:13 /spark-jobs/eventLog/application_1440152921247_0195_1
-rwxrwxrwx   3 spark aaa       31178 2015-09-17 16:13 /spark-jobs/eventLog/application_1440152921247_0195_2
-rwxrwxrwx   3 spark aaa 39124467712 2015-09-18 11:58 /spark-jobs/eventLog/application_1440152921247_0205_1.inprogress
-rwxrwxrwx   3 spark aaa   790045092 2015-09-18 20:40 /spark-jobs/eventLog/application_1440152921247_0206
........
{noformat}
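
For reference, this is roughly how applications end up in that single directory today (a minimal sketch; the application name and HDFS path are illustrative only, while **spark.eventLog.enabled** and **spark.eventLog.dir** are the actual configuration keys):
{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch: every application, in progress or finished, writes its
// event log under the one shared directory configured here.
// The application name and HDFS path are illustrative only.
val conf = new SparkConf()
  .setAppName("event-log-example")
  .set("spark.eventLog.enabled", "true")
  .set("spark.eventLog.dir", "hdfs:///spark-jobs/eventLog")

val sc = new SparkContext(conf)
// ... run the job ...
sc.stop()  // the completed log file stays in the same flat directory
{code}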

As time goes by, there will be a lot of event logs in the **spark.eventLog.dir** 
directory, and they will not be easy to manage. In Hadoop, there are two kinds of 
directories for storing different types of event logs: the done-dir and the 
intermediate-done-dir, configured by **mapreduce.jobhistory.done-dir** and 
**mapreduce.jobhistory.intermediate-done-dir** respectively. In the done-dir, 
event logs are saved into different subdirectories according to the date the job 
finished, as shown below:
{noformat}
[[email protected] /]$ sudo hadoop fs -ls /hadoop-jobs/done/2015/09/
Found 23 items
drwxrwxrwx   - hadoop supergroup    0 2015-09-04 16:59 /hadoop-jobs/done/2015/09/01
drwxrwxrwx   - hadoop supergroup    0 2015-09-05 16:59 /hadoop-jobs/done/2015/09/02
drwxrwxrwx   - hadoop supergroup    0 2015-09-06 16:59 /hadoop-jobs/done/2015/09/03
drwxrwxrwx   - hadoop supergroup    0 2015-09-07 16:59 /hadoop-jobs/done/2015/09/04
drwxrwxrwx   - hadoop supergroup    0 2015-09-08 16:59 /hadoop-jobs/done/2015/09/05
drwxrwxrwx   - hadoop supergroup    0 2015-09-09 16:59 /hadoop-jobs/done/2015/09/06
drwxrwxrwx   - hadoop supergroup    0 2015-09-10 16:59 /hadoop-jobs/done/2015/09/07
drwxrwxrwx   - hadoop supergroup    0 2015-09-11 16:59 /hadoop-jobs/done/2015/09/08
drwxrwxrwx   - hadoop supergroup    0 2015-09-12 16:59 /hadoop-jobs/done/2015/09/09
drwxrwxrwx   - hadoop supergroup    0 2015-09-13 16:59 /hadoop-jobs/done/2015/09/10
drwxrwx---   - hadoop supergroup    0 2015-09-14 16:59 /hadoop-jobs/done/2015/09/11
drwxrwx---   - hadoop supergroup    0 2015-09-15 16:59 /hadoop-jobs/done/2015/09/12
drwxrwxrwx   - hadoop supergroup    0 2015-09-16 16:59 /hadoop-jobs/done/2015/09/13
drwxrwxrwx   - hadoop supergroup    0 2015-09-17 16:59 /hadoop-jobs/done/2015/09/14
drwxrwxrwx   - hadoop supergroup    0 2015-09-18 16:59 /hadoop-jobs/done/2015/09/15
drwxrwxrwx   - hadoop supergroup    0 2015-09-19 16:59 /hadoop-jobs/done/2015/09/16
drwxrwxrwx   - hadoop supergroup    0 2015-09-20 16:59 /hadoop-jobs/done/2015/09/17
drwxrwx---   - hadoop supergroup    0 2015-09-21 16:59 /hadoop-jobs/done/2015/09/18
drwxrwx---   - hadoop supergroup    0 2015-09-22 16:59 /hadoop-jobs/done/2015/09/19
drwxrwx---   - hadoop supergroup    0 2015-09-23 16:59 /hadoop-jobs/done/2015/09/20
drwxrwx---   - hadoop supergroup    0 2015-09-23 16:59 /hadoop-jobs/done/2015/09/21
drwxrwx---   - hadoop supergroup    0 2015-09-22 23:43 /hadoop-jobs/done/2015/09/22
drwxrwx---   - hadoop supergroup    0 2015-09-23 18:55 /hadoop-jobs/done/2015/09/23
{noformat}

In Spark, I think we can do the same thing.
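
One possible direction (just a sketch, not a concrete API proposal): when an application finishes, move its event log from the flat directory into a yyyy/MM/dd subdirectory under **spark.eventLog.dir**, mirroring the JobHistory done-dir layout above. The helpers below (EventLogLayoutSketch, dateSubDir, moveToDoneDir) are hypothetical and do not exist in Spark; they only illustrate the proposed directory scheme:
{code:scala}
import java.text.SimpleDateFormat
import java.util.Date

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical sketch only: none of these helpers exist in Spark.
object EventLogLayoutSketch {

  // e.g. dateSubDir("/spark-jobs/eventLog", t) => /spark-jobs/eventLog/2015/09/18
  def dateSubDir(eventLogRoot: String, finishTime: Long): Path = {
    val fmt = new SimpleDateFormat("yyyy/MM/dd")
    new Path(eventLogRoot, fmt.format(new Date(finishTime)))
  }

  // Move a finished application's log file into its date-partitioned directory.
  def moveToDoneDir(
      eventLogRoot: String,
      logFile: Path,
      finishTime: Long,
      hadoopConf: Configuration): Path = {
    val fs = logFile.getFileSystem(hadoopConf)
    val doneDir = dateSubDir(eventLogRoot, finishTime)
    fs.mkdirs(doneDir)
    val dest = new Path(doneDir, logFile.getName)
    fs.rename(logFile, dest)
    dest
  }
}
{code}
The history server would then have to scan these date subdirectories recursively instead of a single flat directory, which is probably the main compatibility question for this proposal.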



> Put different event logs into different directories according to different conditions
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-10774
>                 URL: https://issues.apache.org/jira/browse/SPARK-10774
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.4.1
>            Reporter: yangping wu
>            Priority: Minor
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

