spark git commit: [SPARK-16078][SQL] Backport: from_utc_timestamp/to_utc_timestamp should not depend on local timezone

2016-10-19 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-1.6 b95ac0d00 -> 82e98f126


[SPARK-16078][SQL] Backport: from_utc_timestamp/to_utc_timestamp should not depend on local timezone

## What changes were proposed in this pull request?

Back-port of https://github.com/apache/spark/pull/13784 to `branch-1.6`

## How was this patch tested?

Existing tests.

Author: Davies Liu 

Closes #15554 from srowen/SPARK-16078.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/82e98f12
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/82e98f12
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/82e98f12

Branch: refs/heads/branch-1.6
Commit: 82e98f1265f98b49893e04590989b623169d66d9
Parents: b95ac0d
Author: Davies Liu 
Authored: Wed Oct 19 22:55:30 2016 -0700
Committer: Reynold Xin 
Committed: Wed Oct 19 22:55:30 2016 -0700

--
 .../expressions/datetimeExpressions.scala   | 10 +--
 .../spark/sql/catalyst/util/DateTimeUtils.scala | 35 +--
 .../sql/catalyst/util/DateTimeUtilsSuite.scala  | 65 
 3 files changed, 74 insertions(+), 36 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/82e98f12/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
index 03c39f8..91eca24 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
@@ -658,16 +658,17 @@ case class FromUTCTimestamp(left: Expression, right: Expression)
  """.stripMargin
   } else {
 val tzTerm = ctx.freshName("tz")
+val utcTerm = ctx.freshName("utc")
 val tzClass = classOf[TimeZone].getName
 ctx.addMutableState(tzClass, tzTerm, s"""$tzTerm = $tzClass.getTimeZone("$tz");""")
+ctx.addMutableState(tzClass, utcTerm, s"""$utcTerm = $tzClass.getTimeZone("UTC");""")
 val eval = left.gen(ctx)
 s"""
|${eval.code}
|boolean ${ev.isNull} = ${eval.isNull};
|long ${ev.value} = 0;
|if (!${ev.isNull}) {
-   |  ${ev.value} = ${eval.value} +
-   |   ${tzTerm}.getOffset(${eval.value} / 1000) * 1000L;
+   |  ${ev.value} = $dtu.convertTz(${eval.value}, $utcTerm, $tzTerm);
|}
  """.stripMargin
   }
@@ -783,16 +784,17 @@ case class ToUTCTimestamp(left: Expression, right: Expression)
  """.stripMargin
   } else {
 val tzTerm = ctx.freshName("tz")
+val utcTerm = ctx.freshName("utc")
 val tzClass = classOf[TimeZone].getName
 ctx.addMutableState(tzClass, tzTerm, s"""$tzTerm = $tzClass.getTimeZone("$tz");""")
+ctx.addMutableState(tzClass, utcTerm, s"""$utcTerm = $tzClass.getTimeZone("UTC");""")
 val eval = left.gen(ctx)
 s"""
|${eval.code}
|boolean ${ev.isNull} = ${eval.isNull};
|long ${ev.value} = 0;
|if (!${ev.isNull}) {
-   |  ${ev.value} = ${eval.value} -
-   |   ${tzTerm}.getOffset(${eval.value} / 1000) * 1000L;
+   |  ${ev.value} = $dtu.convertTz(${eval.value}, $tzTerm, $utcTerm);
|}
  """.stripMargin
   }

http://git-wip-us.apache.org/repos/asf/spark/blob/82e98f12/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
index 157ac2b..36fe11c 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
@@ -55,6 +55,7 @@ object DateTimeUtils {
   // this is year -17999, calculation: 50 * daysIn400Year
   final val YearZero = -17999
   final val toYearZero = to2001 + 7304850
+  final val TimeZoneGMT = TimeZone.getTimeZone("GMT")
 
   @transient lazy val defaultTimeZone = TimeZone.getDefault
 
@@ -855,13 +856,37 @@ object DateTimeUtils {
   }
 
   /**
+   * Convert the timestamp `ts` from one timezone to another.
+   *
+   * TODO: Because of DST, the conversion between UTC and human time is not exactly one-to-one
+   * mapping, the conversion here may return 

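As the truncated doc comment above notes, DST makes the offset instant-dependent. Below is a simplified, self-contained sketch of the kind of two-timezone shift that the generated code now delegates to a single `DateTimeUtils.convertTz` helper, instead of applying only the target zone's offset as before. This is an illustrative approximation, not Spark's actual implementation; the object name and example timestamp are assumptions.

```scala
import java.util.TimeZone

object ConvertTzSketch {
  // Shift a timestamp (microseconds since the epoch) from the wall clock of
  // `fromZone` to the wall clock of `toZone`, using both zones' offsets at
  // that instant rather than only the target zone's offset.
  def convertTzSketch(micros: Long, fromZone: TimeZone, toZone: TimeZone): Long = {
    val millis = micros / 1000L
    val offsetDiffMillis = toZone.getOffset(millis) - fromZone.getOffset(millis)
    micros + offsetDiffMillis * 1000L
  }

  def main(args: Array[String]): Unit = {
    val utc = TimeZone.getTimeZone("UTC")
    val la  = TimeZone.getTimeZone("America/Los_Angeles")
    val ts  = 1476921600000000L // 2016-10-19 00:00:00 UTC, in microseconds
    // from_utc_timestamp-style shift: UTC -> America/Los_Angeles (7 hours earlier during DST)
    println(convertTzSketch(ts, utc, la))
    // to_utc_timestamp-style shift: America/Los_Angeles -> UTC
    println(convertTzSketch(ts, la, utc))
  }
}
```
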
spark git commit: [SPARK-18012][SQL] Simplify WriterContainer

2016-10-19 Thread lian
Repository: spark
Updated Branches:
  refs/heads/master 4b2011ec9 -> f313117bc


[SPARK-18012][SQL] Simplify WriterContainer

## What changes were proposed in this pull request?
This patch refactors WriterContainer to simplify the logic and make the control 
flow more obvious. The previous setup made it difficult to track the actual 
dependencies between variables and setup steps, because the driver side and the 
executor side were using the same set of variables.

## How was this patch tested?
N/A - this should be covered by existing tests.

Author: Reynold Xin 

Closes #15551 from rxin/writercontainer-refactor.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f313117b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f313117b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f313117b

Branch: refs/heads/master
Commit: f313117bc93b0bf560528b316d3e6947caa96296
Parents: 4b2011e
Author: Reynold Xin 
Authored: Wed Oct 19 22:22:35 2016 -0700
Committer: Cheng Lian 
Committed: Wed Oct 19 22:22:35 2016 -0700

--
 .../InsertIntoHadoopFsRelationCommand.scala |  79 +--
 .../sql/execution/datasources/WriteOutput.scala | 480 +++
 .../execution/datasources/WriterContainer.scala | 445 -
 .../org/apache/spark/sql/internal/SQLConf.scala |   9 -
 4 files changed, 492 insertions(+), 521 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f313117b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala
index 99ca3df..22dbe71 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala
@@ -20,18 +20,12 @@ package org.apache.spark.sql.execution.datasources
 import java.io.IOException
 
 import org.apache.hadoop.fs.Path
-import org.apache.hadoop.mapreduce._
-import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
 
-import org.apache.spark._
 import org.apache.spark.sql._
 import org.apache.spark.sql.catalyst.catalog.BucketSpec
-import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeSet}
+import org.apache.spark.sql.catalyst.expressions.Attribute
 import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
-import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.execution.SQLExecution
 import org.apache.spark.sql.execution.command.RunnableCommand
-import org.apache.spark.sql.internal.SQLConf
 
 /**
  * A command for writing data to a [[HadoopFsRelation]].  Supports both overwriting and appending.
@@ -40,20 +34,6 @@ import org.apache.spark.sql.internal.SQLConf
 * implementation of [[HadoopFsRelation]] should use this UUID together with task id to generate
 * unique file path for each task output file.  This UUID is passed to executor side via a
  * property named `spark.sql.sources.writeJobUUID`.
- *
- * Different writer containers, [[DefaultWriterContainer]] and [[DynamicPartitionWriterContainer]]
- * are used to write to normal tables and tables with dynamic partitions.
- *
- * Basic work flow of this command is:
- *
- *   1. Driver side setup, including output committer initialization and data source specific
- *  preparation work for the write job to be issued.
- *   2. Issues a write job consists of one or more executor side tasks, each of which writes all
- *  rows within an RDD partition.
- *   3. If no exception is thrown in a task, commits that task, otherwise aborts that task;  If any
- *  exception is thrown during task commitment, also aborts that task.
- *   4. If all tasks are committed, commit the job, otherwise aborts the job;  If any exception is
- *  thrown during job commitment, also aborts the job.
  */
 case class InsertIntoHadoopFsRelationCommand(
 outputPath: Path,
@@ -103,52 +83,17 @@ case class InsertIntoHadoopFsRelationCommand(
 val isAppend = pathExists && (mode == SaveMode.Append)
 
 if (doInsertion) {
-  val job = Job.getInstance(hadoopConf)
-  job.setOutputKeyClass(classOf[Void])
-  job.setOutputValueClass(classOf[InternalRow])
-  FileOutputFormat.setOutputPath(job, qualifiedOutputPath)
-
-  val partitionSet = AttributeSet(partitionColumns)
-  val dataColumns = query.output.filterNot(partitionSet.contains)
-
-  val queryExecution = 

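The per-task flow described in the removed doc comment above (write all rows of a partition, commit the task on success, abort it on any failure, including a failure during commit) is what the refactored code keeps explicit. A self-contained sketch of that task-level control flow, using illustrative stand-in types rather than Spark's actual classes:

```scala
// Stand-in types; Spark's real OutputWriter and commit-protocol classes differ.
trait Row
trait OutputWriter { def write(row: Row): Unit; def close(): Unit }
trait TaskCommitter { def commitTask(): Unit; def abortTask(): Unit }

object WriteTaskSketch {
  // Executor-side task: write every row of its partition, then commit the task.
  // On any failure -- including one thrown while committing -- abort the task
  // and rethrow so the driver can fail (and eventually abort) the whole job.
  def runTask(rows: Iterator[Row], writer: OutputWriter, committer: TaskCommitter): Unit = {
    try {
      rows.foreach(writer.write)
      writer.close()
      committer.commitTask()
    } catch {
      case t: Throwable =>
        committer.abortTask()
        throw t
    }
  }
}
```
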
spark git commit: [SPARK-17989][SQL] Check ascendingOrder type in sort_array function rather than throwing ClassCastException

2016-10-19 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-2.0 cdd2570e6 -> 995f602d2


[SPARK-17989][SQL] Check ascendingOrder type in sort_array function rather than 
throwing ClassCastException

## What changes were proposed in this pull request?

This PR proposes to check the second argument, `ascendingOrder`, and report a 
clear type-check error rather than throwing a `ClassCastException` with an obscure message.

```sql
select sort_array(array('b', 'd'), '1');
```

**Before**

```
16/10/19 13:16:08 ERROR SparkSQLDriver: Failed in [select sort_array(array('b', 'd'), '1')]
java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Boolean
at scala.runtime.BoxesRunTime.unboxToBoolean(BoxesRunTime.java:85)
at org.apache.spark.sql.catalyst.expressions.SortArray.nullSafeEval(collectionOperations.scala:185)
at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:416)
at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:50)
at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:43)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:292)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:292)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:291)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:297)
```

**After**

```
Error in query: cannot resolve 'sort_array(array('b', 'd'), '1')' due to data type mismatch: Sort order in second argument requires a boolean literal.; line 1 pos 7;
```

## How was this patch tested?

Unit test in `DataFrameFunctionsSuite`.

Author: hyukjinkwon 

Closes #15532 from HyukjinKwon/SPARK-17989.

(cherry picked from commit 4b2011ec9da1245923b5cbd883240fef0dbf3ef0)
Signed-off-by: Reynold Xin 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/995f602d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/995f602d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/995f602d

Branch: refs/heads/branch-2.0
Commit: 995f602d27bdcf9e6787d93dbea2357e6dc6ccaa
Parents: cdd2570
Author: hyukjinkwon 
Authored: Wed Oct 19 19:36:21 2016 -0700
Committer: Reynold Xin 
Committed: Wed Oct 19 19:36:53 2016 -0700

--
 .../expressions/collectionOperations.scala  |  8 +++-
 .../test/resources/sql-tests/inputs/array.sql   |  6 ++
 .../resources/sql-tests/results/array.sql.out   | 21 +---
 3 files changed, 31 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/995f602d/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
index 2e8ea11..1efe2cb 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
@@ -112,7 +112,13 @@ case class SortArray(base: Expression, ascendingOrder: Expression)
 
   override def checkInputDataTypes(): TypeCheckResult = base.dataType match {
 case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
-  TypeCheckResult.TypeCheckSuccess
+  ascendingOrder match {
+case Literal(_: Boolean, BooleanType) =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+"Sort order in second argument requires a boolean literal.")
+  }
 case ArrayType(dt, _) =>
   TypeCheckResult.TypeCheckFailure(
  s"$prettyName does not support sorting array of type ${dt.simpleString}")

http://git-wip-us.apache.org/repos/asf/spark/blob/995f602d/sql/core/src/test/resources/sql-tests/inputs/array.sql
--
diff --git a/sql/core/src/test/resources/sql-tests/inputs/array.sql 
b/sql/core/src/test/resources/sql-tests/inputs/array.sql
index 4038a0d..984321a 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/array.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/array.sql
@@ -71,6 +71,12 @@ select
   sort_array(timestamp_array)
 from 

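A quick way to exercise the new check from Scala (a hedged usage sketch; the session setup below is illustrative and not part of this patch): the invalid call now fails analysis instead of failing at evaluation time.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}

object SortArrayCheckDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("sort_array-check").getOrCreate()
    // Valid: the sort order is a boolean literal.
    spark.sql("select sort_array(array('b', 'd'), true)").show()
    // Invalid: a string literal sort order is rejected during analysis with the
    // "requires a boolean literal" message instead of a ClassCastException later.
    try {
      spark.sql("select sort_array(array('b', 'd'), '1')").show()
    } catch {
      case e: AnalysisException => println(e.getMessage)
    }
    spark.stop()
  }
}
```
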
spark git commit: [SPARK-17989][SQL] Check ascendingOrder type in sort_array function rather than throwing ClassCastException

2016-10-19 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master 444c2d22e -> 4b2011ec9


[SPARK-17989][SQL] Check ascendingOrder type in sort_array function rather than 
throwing ClassCastException

## What changes were proposed in this pull request?

This PR proposes to check the second argument, `ascendingOrder`, and report a 
clear type-check error rather than throwing a `ClassCastException` with an obscure message.

```sql
select sort_array(array('b', 'd'), '1');
```

**Before**

```
16/10/19 13:16:08 ERROR SparkSQLDriver: Failed in [select sort_array(array('b', 'd'), '1')]
java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Boolean
at scala.runtime.BoxesRunTime.unboxToBoolean(BoxesRunTime.java:85)
at org.apache.spark.sql.catalyst.expressions.SortArray.nullSafeEval(collectionOperations.scala:185)
at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:416)
at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:50)
at org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:43)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:292)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:292)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:291)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:297)
```

**After**

```
Error in query: cannot resolve 'sort_array(array('b', 'd'), '1')' due to data type mismatch: Sort order in second argument requires a boolean literal.; line 1 pos 7;
```

## How was this patch tested?

Unit test in `DataFrameFunctionsSuite`.

Author: hyukjinkwon 

Closes #15532 from HyukjinKwon/SPARK-17989.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4b2011ec
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4b2011ec
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4b2011ec

Branch: refs/heads/master
Commit: 4b2011ec9da1245923b5cbd883240fef0dbf3ef0
Parents: 444c2d2
Author: hyukjinkwon 
Authored: Wed Oct 19 19:36:21 2016 -0700
Committer: Reynold Xin 
Committed: Wed Oct 19 19:36:21 2016 -0700

--
 .../expressions/collectionOperations.scala  |  8 +++-
 .../test/resources/sql-tests/inputs/array.sql   |  6 ++
 .../resources/sql-tests/results/array.sql.out   | 21 +---
 3 files changed, 31 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/4b2011ec/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
index c020029..f56bb39 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
@@ -124,7 +124,13 @@ case class SortArray(base: Expression, ascendingOrder: Expression)
 
   override def checkInputDataTypes(): TypeCheckResult = base.dataType match {
 case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
-  TypeCheckResult.TypeCheckSuccess
+  ascendingOrder match {
+case Literal(_: Boolean, BooleanType) =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+"Sort order in second argument requires a boolean literal.")
+  }
 case ArrayType(dt, _) =>
   TypeCheckResult.TypeCheckFailure(
  s"$prettyName does not support sorting array of type ${dt.simpleString}")

http://git-wip-us.apache.org/repos/asf/spark/blob/4b2011ec/sql/core/src/test/resources/sql-tests/inputs/array.sql
--
diff --git a/sql/core/src/test/resources/sql-tests/inputs/array.sql 
b/sql/core/src/test/resources/sql-tests/inputs/array.sql
index 4038a0d..984321a 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/array.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/array.sql
@@ -71,6 +71,12 @@ select
   sort_array(timestamp_array)
 from primitive_arrays;
 
+-- sort_array with an invalid string literal for the argument of sort order.
+select sort_array(array('b', 'd'), '1');

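A self-contained sketch of the guard pattern the diff adds in `checkInputDataTypes`; the types below are simplified stand-ins for Spark's `Expression`, `Literal`, and `TypeCheckResult`, not the real classes.

```scala
object TypeCheckSketch {
  sealed trait CheckResult
  case object CheckSuccess extends CheckResult
  final case class CheckFailure(message: String) extends CheckResult

  sealed trait Expr
  final case class BooleanLiteral(value: Boolean) extends Expr
  final case class StringLiteral(value: String) extends Expr

  // Accept only a literal Boolean as the ascendingOrder argument; anything else
  // is reported as a type-check failure up front, before evaluation.
  def checkSortOrder(ascendingOrder: Expr): CheckResult = ascendingOrder match {
    case BooleanLiteral(_) => CheckSuccess
    case _ => CheckFailure("Sort order in second argument requires a boolean literal.")
  }

  def main(args: Array[String]): Unit = {
    println(checkSortOrder(BooleanLiteral(true))) // CheckSuccess
    println(checkSortOrder(StringLiteral("1")))   // CheckFailure(...)
  }
}
```
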
spark git commit: [SPARK-10541][WEB UI] Allow ApplicationHistoryProviders to provide their own text when there aren't any complete apps

2016-10-19 Thread vanzin
Repository: spark
Updated Branches:
  refs/heads/master 9540357ad -> 444c2d22e


[SPARK-10541][WEB UI] Allow ApplicationHistoryProviders to provide their own 
text when there aren't any complete apps

## What changes were proposed in this pull request?

I've added a method to `ApplicationHistoryProvider` that returns the html 
paragraph to display when there are no applications. This allows providers 
other than `FsHistoryProvider` to determine what is printed. The current 
hard-coded text is now moved into `FsHistoryProvider`, since that text assumed 
it was the provider in use.

I chose to make the function return html rather than text because the existing 
text block had inline html in it, and returning html gives a new implementation 
of `ApplicationHistoryProvider` more versatility. I did not see any security 
issues with this, since injecting html here requires implementing 
`ApplicationHistoryProvider` and can't be done outside of code.

## How was this patch tested?

Manual testing and dev/run-tests

No visible changes to the UI

Author: Alex Bozarth 

Closes #15490 from ajbozarth/spark10541.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/444c2d22
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/444c2d22
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/444c2d22

Branch: refs/heads/master
Commit: 444c2d22e38a8a78135adf0d3a3774f0e9fc866c
Parents: 9540357
Author: Alex Bozarth 
Authored: Wed Oct 19 13:01:33 2016 -0700
Committer: Marcelo Vanzin 
Committed: Wed Oct 19 13:01:33 2016 -0700

--
 .../deploy/history/ApplicationHistoryProvider.scala |  6 ++
 .../apache/spark/deploy/history/FsHistoryProvider.scala | 12 
 .../org/apache/spark/deploy/history/HistoryPage.scala   |  8 +---
 .../org/apache/spark/deploy/history/HistoryServer.scala |  8 
 4 files changed, 27 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/444c2d22/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
 
b/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
index ad7a097..06530ff 100644
--- 
a/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
+++ 
b/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
@@ -19,6 +19,8 @@ package org.apache.spark.deploy.history
 
 import java.util.zip.ZipOutputStream
 
+import scala.xml.Node
+
 import org.apache.spark.SparkException
 import org.apache.spark.ui.SparkUI
 
@@ -114,4 +116,8 @@ private[history] abstract class ApplicationHistoryProvider {
*/
   def getApplicationInfo(appId: String): Option[ApplicationHistoryInfo]
 
+  /**
+   * @return html text to display when the application list is empty
+   */
+  def getEmptyListingHtml(): Seq[Node] = Seq.empty
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/444c2d22/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala 
b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index 3c2d169..530cc52 100644
--- 
a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ 
b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -23,6 +23,7 @@ import java.util.concurrent.{Executors, ExecutorService, TimeUnit}
 import java.util.zip.{ZipEntry, ZipOutputStream}
 
 import scala.collection.mutable
+import scala.xml.Node
 
 import com.google.common.io.ByteStreams
 import com.google.common.util.concurrent.{MoreExecutors, ThreadFactoryBuilder}
@@ -262,6 +263,17 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
 }
   }
 
+  override def getEmptyListingHtml(): Seq[Node] = {
+    <p>
+      Did you specify the correct logging directory? Please verify your setting of
+      <span style="font-style:italic">spark.history.fs.logDirectory</span>
+      listed above and whether you have the permissions to access it.
+      <br/>
+      It is also possible that your application did not run to
+      completion or did not stop the SparkContext.
+    </p>
+  }
+
   override def getConfig(): Map[String, String] = {
 val safeMode = if (isFsInSafeMode()) {
   Map("HDFS State" -> "In safe mode, application logs not available.")

http://git-wip-us.apache.org/repos/asf/spark/blob/444c2d22/core/src/main/scala/org/apache/spark/deploy/history/HistoryPage.scala

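A hedged sketch of what the new hook enables for an alternative provider; the trait and class names below are simplified stand-ins, not Spark's actual (package-private) history classes.

```scala
import scala.xml.Node

// Simplified stand-in for ApplicationHistoryProvider: a default empty-listing
// message that concrete providers may override.
trait HistoryProviderSketch {
  def getEmptyListingHtml(): Seq[Node] = Seq.empty
}

// A hypothetical provider backed by an external event store supplies its own
// explanation when no completed applications are listed.
class EventStoreProviderSketch extends HistoryProviderSketch {
  override def getEmptyListingHtml(): Seq[Node] = {
    <p>
      No completed applications were found in the configured event store.
      Check this provider's connection settings and retention policy.
    </p>
  }
}
```
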
spark git commit: [SPARK-17985][CORE] Bump commons-lang3 version to 3.5.

2016-10-19 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master f39852e59 -> 9540357ad


[SPARK-17985][CORE] Bump commons-lang3 version to 3.5.

## What changes were proposed in this pull request?

`SerializationUtils.clone()` in commons-lang3 (< 3.5) has a bug that breaks 
thread safety: it sometimes gets stuck, caused by a race condition when 
initializing a hash map.
See https://issues.apache.org/jira/browse/LANG-1251.

## How was this patch tested?

Existing tests.

Author: Takuya UESHIN 

Closes #15548 from ueshin/issues/SPARK-17985.
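
To illustrate the kind of usage affected, here is a hedged, self-contained stress sketch; the cloned type, object names, and thread/iteration counts are illustrative, not anything Spark itself runs.

```scala
import org.apache.commons.lang3.SerializationUtils

object CloneStressSketch {
  // A small Serializable payload standing in for whatever gets cloned.
  final case class Conf(entries: Map[String, String])

  def main(args: Array[String]): Unit = {
    val original = Conf(Map("spark.app.name" -> "demo"))
    // Clone the same object from several threads at once. With commons-lang3
    // < 3.5 this could intermittently hang due to a race while initializing an
    // internal map (LANG-1251); 3.5 makes it safe.
    val threads = (1 to 8).map { _ =>
      new Thread(new Runnable {
        override def run(): Unit = {
          var i = 0
          while (i < 1000) {
            val copy = SerializationUtils.clone(original)
            require(copy == original)
            i += 1
          }
        }
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    println("all clones completed")
  }
}
```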


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9540357a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9540357a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9540357a

Branch: refs/heads/master
Commit: 9540357ada7df1acfefa7b775c82675cd475244c
Parents: f39852e
Author: Takuya UESHIN 
Authored: Wed Oct 19 10:06:43 2016 +0100
Committer: Sean Owen 
Committed: Wed Oct 19 10:06:43 2016 +0100

--
 dev/deps/spark-deps-hadoop-2.2  | 2 +-
 dev/deps/spark-deps-hadoop-2.3  | 2 +-
 dev/deps/spark-deps-hadoop-2.4  | 2 +-
 dev/deps/spark-deps-hadoop-2.6  | 2 +-
 dev/deps/spark-deps-hadoop-2.7  | 2 +-
 docs/streaming-flume-integration.md | 4 ++--
 pom.xml | 2 +-
 7 files changed, 8 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/9540357a/dev/deps/spark-deps-hadoop-2.2
--
diff --git a/dev/deps/spark-deps-hadoop-2.2 b/dev/deps/spark-deps-hadoop-2.2
index b30f8c3..525dcef 100644
--- a/dev/deps/spark-deps-hadoop-2.2
+++ b/dev/deps/spark-deps-hadoop-2.2
@@ -33,7 +33,7 @@ commons-digester-1.8.jar
 commons-httpclient-3.1.jar
 commons-io-2.4.jar
 commons-lang-2.6.jar
-commons-lang3-3.3.2.jar
+commons-lang3-3.5.jar
 commons-logging-1.1.3.jar
 commons-math-2.1.jar
 commons-math3-3.4.1.jar

http://git-wip-us.apache.org/repos/asf/spark/blob/9540357a/dev/deps/spark-deps-hadoop-2.3
--
diff --git a/dev/deps/spark-deps-hadoop-2.3 b/dev/deps/spark-deps-hadoop-2.3
index 5b3a765..562fe64 100644
--- a/dev/deps/spark-deps-hadoop-2.3
+++ b/dev/deps/spark-deps-hadoop-2.3
@@ -36,7 +36,7 @@ commons-digester-1.8.jar
 commons-httpclient-3.1.jar
 commons-io-2.4.jar
 commons-lang-2.6.jar
-commons-lang3-3.3.2.jar
+commons-lang3-3.5.jar
 commons-logging-1.1.3.jar
 commons-math3-3.4.1.jar
 commons-net-2.2.jar

http://git-wip-us.apache.org/repos/asf/spark/blob/9540357a/dev/deps/spark-deps-hadoop-2.4
--
diff --git a/dev/deps/spark-deps-hadoop-2.4 b/dev/deps/spark-deps-hadoop-2.4
index e323efe..747521a 100644
--- a/dev/deps/spark-deps-hadoop-2.4
+++ b/dev/deps/spark-deps-hadoop-2.4
@@ -36,7 +36,7 @@ commons-digester-1.8.jar
 commons-httpclient-3.1.jar
 commons-io-2.4.jar
 commons-lang-2.6.jar
-commons-lang3-3.3.2.jar
+commons-lang3-3.5.jar
 commons-logging-1.1.3.jar
 commons-math3-3.4.1.jar
 commons-net-2.2.jar

http://git-wip-us.apache.org/repos/asf/spark/blob/9540357a/dev/deps/spark-deps-hadoop-2.6
--
diff --git a/dev/deps/spark-deps-hadoop-2.6 b/dev/deps/spark-deps-hadoop-2.6
index 77d97e5..afd4502 100644
--- a/dev/deps/spark-deps-hadoop-2.6
+++ b/dev/deps/spark-deps-hadoop-2.6
@@ -40,7 +40,7 @@ commons-digester-1.8.jar
 commons-httpclient-3.1.jar
 commons-io-2.4.jar
 commons-lang-2.6.jar
-commons-lang3-3.3.2.jar
+commons-lang3-3.5.jar
 commons-logging-1.1.3.jar
 commons-math3-3.4.1.jar
 commons-net-2.2.jar

http://git-wip-us.apache.org/repos/asf/spark/blob/9540357a/dev/deps/spark-deps-hadoop-2.7
--
diff --git a/dev/deps/spark-deps-hadoop-2.7 b/dev/deps/spark-deps-hadoop-2.7
index 572edfa..687b855 100644
--- a/dev/deps/spark-deps-hadoop-2.7
+++ b/dev/deps/spark-deps-hadoop-2.7
@@ -40,7 +40,7 @@ commons-digester-1.8.jar
 commons-httpclient-3.1.jar
 commons-io-2.4.jar
 commons-lang-2.6.jar
-commons-lang3-3.3.2.jar
+commons-lang3-3.5.jar
 commons-logging-1.1.3.jar
 commons-math3-3.4.1.jar
 commons-net-2.2.jar

http://git-wip-us.apache.org/repos/asf/spark/blob/9540357a/docs/streaming-flume-integration.md
--
diff --git a/docs/streaming-flume-integration.md 
b/docs/streaming-flume-integration.md
index 767e1f9..a5d36da 100644
--- a/docs/streaming-flume-integration.md
+++ b/docs/streaming-flume-integration.md
@@ -115,11 +115,11 @@ Configuring Flume on the chosen machine requires the following two steps.
artifactId = scala-library