[spark] branch master updated (ec34a00 -> 77a8efb)

2020-10-14 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from ec34a00  [SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVersionsSuite on Python 3.8/3.9
 add 77a8efb  [SPARK-32932][SQL] Do not use local shuffle reader at final stage on write command

No new revisions were added by this update.

Summary of changes:
 .../execution/adaptive/AdaptiveSparkPlanExec.scala | 14 +-
 .../adaptive/AdaptiveQueryExecSuite.scala          | 51 +-
 2 files changed, 63 insertions(+), 2 deletions(-)
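
For context on SPARK-32932: with adaptive query execution enabled, Spark can replace a shuffle read with a local (per-mapper) shuffle reader, which changes the number of partitions produced. At the final stage of a write command, that would silently change the output file layout the user asked for. A minimal sketch of the kind of job affected, assuming Spark 3.x with AQE on (the output path and object name here are illustrative, not from the commit):

```scala
import org.apache.spark.sql.SparkSession

object WriteWithRequestedPartitioning {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[4]")
      .appName("SPARK-32932 sketch")
      .config("spark.sql.adaptive.enabled", "true")                    // AQE on
      .config("spark.sql.adaptive.localShuffleReader.enabled", "true") // may rewrite shuffle reads
      .getOrCreate()

    // The user asks for exactly 5 partitions before writing. If AQE swapped
    // the final shuffle read for a local shuffle reader, the write could
    // produce a different number of output files than requested; this commit
    // avoids that rewrite when the final stage feeds a write command.
    spark.range(0, 1000).toDF("id")
      .repartition(5)
      .write.mode("overwrite").parquet("/tmp/spark-32932-sketch")

    spark.stop()
  }
}
```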


[spark] branch branch-3.0 updated: Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new e40c147  Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"
e40c147 is described below

commit e40c147a5d194adbba13f12590959dc68347ec14
Author: Dongjoon Hyun 
AuthorDate: Wed Oct 14 21:47:46 2020 -0700

Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"

This reverts commit d9669bdf0ff4ed9951d7077b8dc9ad94507615c5.
---
 .../spark/deploy/history/FsHistoryProvider.scala   |  3 --
 .../deploy/history/FsHistoryProviderSuite.scala    | 49 --
 2 files changed, 52 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index 5970708..c262152 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -526,9 +526,6 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
           reader.fileSizeForLastIndex > 0
         } catch {
           case _: FileNotFoundException => false
-          case NonFatal(e) =>
-            logWarning(s"Error while reading new log ${reader.rootPath}", e)
-            false
         }

     case _: FileNotFoundException =>
diff --git a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
index f3beb35..c2f34fc 100644
--- a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
@@ -1470,55 +1470,6 @@ class FsHistoryProviderSuite extends SparkFunSuite with Matchers with Logging {
     }
   }
 
-  test("SPARK-33146: don't let one bad rolling log folder prevent loading other applications") {
-    withTempDir { dir =>
-      val conf = createTestConf(true)
-      conf.set(HISTORY_LOG_DIR, dir.getAbsolutePath)
-      val hadoopConf = SparkHadoopUtil.newConfiguration(conf)
-      val fs = new Path(dir.getAbsolutePath).getFileSystem(hadoopConf)
-
-      val provider = new FsHistoryProvider(conf)
-
-      val writer = new RollingEventLogFilesWriter("app", None, dir.toURI, conf, hadoopConf)
-      writer.start()
-
-      writeEventsToRollingWriter(writer, Seq(
-        SparkListenerApplicationStart("app", Some("app"), 0, "user", None),
-        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
-      provider.checkForLogs()
-      provider.cleanLogs()
-      assert(dir.listFiles().size === 1)
-      assert(provider.getListing.length === 1)
-
-      // Manually delete the appstatus file to make an invalid rolling event log
-      val appStatusPath = RollingEventLogFilesWriter.getAppStatusFilePath(new Path(writer.logPath),
-        "app", None, true)
-      fs.delete(appStatusPath, false)
-      provider.checkForLogs()
-      provider.cleanLogs()
-      assert(provider.getListing.length === 0)
-
-      // Create a new application
-      val writer2 = new RollingEventLogFilesWriter("app2", None, dir.toURI, conf, hadoopConf)
-      writer2.start()
-      writeEventsToRollingWriter(writer2, Seq(
-        SparkListenerApplicationStart("app2", Some("app2"), 0, "user", None),
-        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
-
-      // Both folders exist but only one application found
-      provider.checkForLogs()
-      provider.cleanLogs()
-      assert(provider.getListing.length === 1)
-      assert(dir.listFiles().size === 2)
-
-      // Make sure a new provider sees the valid application
-      provider.stop()
-      val newProvider = new FsHistoryProvider(conf)
-      newProvider.checkForLogs()
-      assert(newProvider.getListing.length === 1)
-    }
-  }
-
  /**
   * Asks the provider to check for logs and calls a function to perform checks on the updated
   * app list. Example:


[spark] branch branch-3.0 updated: [SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVersionsSuite on Python 3.8/3.9

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 0b7b811  [SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVersionsSuite on Python 3.8/3.9
0b7b811 is described below

commit 0b7b811c6464ac8a0fe5230dc49aefc2f5507db8
Author: Dongjoon Hyun 
AuthorDate: Wed Oct 14 20:48:13 2020 -0700

[SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVersionsSuite on Python 3.8/3.9

### What changes were proposed in this pull request?

This PR aims to ignore the Apache Spark 2.4.x distribution in HiveExternalCatalogVersionsSuite if the Python version is 3.8 or 3.9.

### Why are the changes needed?

Currently, `HiveExternalCatalogVersionsSuite` is broken on recent operating systems such as `Ubuntu 20.04`, whose default Python version is 3.8. PySpark 2.4.x doesn't work on Python 3.8 due to SPARK-29536.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually.
```
$ python3 --version
Python 3.8.5

$ build/sbt "hive/testOnly *.HiveExternalCatalogVersionsSuite"
...
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
```

Closes #30044 from dongjoon-hyun/SPARK-33153.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit ec34a001ad0ef57a496f29a6523d905128875b17)
Signed-off-by: Dongjoon Hyun 
---
 core/src/main/scala/org/apache/spark/TestUtils.scala      | 13 +
 .../spark/sql/hive/HiveExternalCatalogVersionsSuite.scala |  3 ++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/TestUtils.scala b/core/src/main/scala/org/apache/spark/TestUtils.scala
index 1e00769..054e7b0 100644
--- a/core/src/main/scala/org/apache/spark/TestUtils.scala
+++ b/core/src/main/scala/org/apache/spark/TestUtils.scala
@@ -249,6 +249,19 @@ private[spark] object TestUtils {
     attempt.isSuccess && attempt.get == 0
   }

+  def isPythonVersionAtLeast38(): Boolean = {
+    val attempt = if (Utils.isWindows) {
+      Try(Process(Seq("cmd.exe", "/C", "python3 --version"))
+        .run(ProcessLogger(s => s.startsWith("Python 3.8") || s.startsWith("Python 3.9")))
+        .exitValue())
+    } else {
+      Try(Process(Seq("sh", "-c", "python3 --version"))
+        .run(ProcessLogger(s => s.startsWith("Python 3.8") || s.startsWith("Python 3.9")))
+        .exitValue())
+    }
+    attempt.isSuccess && attempt.get == 0
+  }
+
   /**
    * Returns the response code from an HTTP(S) URL.
    */
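
A side note on the helper above: `ProcessLogger` receives each line the process prints, and the method's result is ultimately the exit-status check. A small, hypothetical sketch (not the Spark helper itself) showing how the same `scala.sys.process` API can capture the `python3 --version` output for inspection:

```scala
import scala.sys.process._
import scala.util.Try

// Hypothetical helper, for illustration only: run `python3 --version`,
// collect everything it prints via ProcessLogger (the one-argument form
// routes both stdout and stderr to the same function), and return the
// version string when the process exits cleanly.
def python3Version(): Option[String] = {
  val out = new StringBuilder
  val exit = Try(Process(Seq("python3", "--version"))
    .run(ProcessLogger(line => out.append(line)))
    .exitValue())
  if (exit.isSuccess && exit.get == 0) Some(out.toString.trim) else None
}
```
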
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
index cbfdb7f..b81b7e8 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
@@ -234,7 +234,7 @@ object PROCESS_TABLES extends QueryTest with SQLTestUtils {
   // Tests the latest version of every release line.
   val testingVersions: Seq[String] = {
     import scala.io.Source
-    try {
+    val versions: Seq[String] = try {
       Source.fromURL(s"${releaseMirror}/spark").mkString
         .split("\n")
         .filter(_.contains(""" ... Seq("3.0.1", "2.4.7") // A temporary fallback to use a specific version
     }
+    versions.filter(v => v.startsWith("3") || !TestUtils.isPythonVersionAtLeast38())
   }

   protected var spark: SparkSession = _
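
The effect of the new filter line, in isolation: every 3.x release is always tested, while a 2.4.x release is kept only when the local Python is older than 3.8. A self-contained illustration (the version list and the boolean are stand-ins for the suite's real values):

```scala
// Stand-in for TestUtils.isPythonVersionAtLeast38(), which probes python3.
val pythonAtLeast38 = true

val versions = Seq("3.0.1", "2.4.7")
val testingVersions = versions.filter(v => v.startsWith("3") || !pythonAtLeast38)

println(testingVersions) // List(3.0.1): Spark 2.4 is skipped on Python 3.8/3.9.
                         // With pythonAtLeast38 = false: List(3.0.1, 2.4.7).
```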




[spark] branch master updated (9ab0ec4 -> ec34a00)

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 9ab0ec4  [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS
 add ec34a00  [SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVersionsSuite on Python 3.8/3.9

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/TestUtils.scala      | 13 +
 .../spark/sql/hive/HiveExternalCatalogVersionsSuite.scala |  3 ++-
 2 files changed, 15 insertions(+), 1 deletion(-)




[spark] branch branch-2.4 updated: [SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
 new 816ffd7  [SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved
816ffd7 is described below

commit 816ffd7f10eaaaba334ca90554cda75beca8083f
Author: Jungtaek Lim (HeartSaVioR) 
AuthorDate: Wed Oct 14 20:09:37 2020 -0700

[SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

### What changes were proposed in this pull request?

This PR proposes to fix a bug in `AppendData.outputResolved`, where `DataType.equalsIgnoreCompatibleNullability` was called with mistakenly swapped parameters. The parameters of `DataType.equalsIgnoreCompatibleNullability` are `from` and `to`, so the right order of arguments is `inAttr` and then `outAttr`.

### Why are the changes needed?

Although the problematic part is dead code, once we know there's a bug, we'd prefer to fix it.

### Does this PR introduce _any_ user-facing change?

No, as this fixes the dead code.

### How was this patch tested?

This patch fixes dead code, for which it is not easy to craft a test. (The test in the original commit is no longer valid here.)

Closes #30043 from HeartSaVioR/SPARK-33136-branch-2.4.

Authored-by: Jungtaek Lim (HeartSaVioR) 
Signed-off-by: Dongjoon Hyun 
---
 .../apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
index 1c25586..a0086c1 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
@@ -378,7 +378,7 @@ case class AppendData(
       case (inAttr, outAttr) =>
         // names and types must match, nullability must be compatible
         inAttr.name == outAttr.name &&
-          DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, inAttr.dataType) &&
+          DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, outAttr.dataType) &&
          (outAttr.nullable || !inAttr.nullable)
     }
   }
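
To see why the argument order matters: `equalsIgnoreCompatibleNullability(from, to)` answers whether a value of type `from` can be stored into a slot of type `to`, which is safe only if `to` is nullable wherever `from` is. Since the real method is internal to the sql package, the sketch below reimplements just the array case to show the asymmetry (hypothetical helper, not Spark's implementation):

```scala
import org.apache.spark.sql.types._

// Simplified stand-in for DataType.equalsIgnoreCompatibleNullability(from, to):
// writing `from` into `to` is fine only if `to` allows nulls wherever `from` does.
def writable(from: DataType, to: DataType): Boolean = (from, to) match {
  case (ArrayType(f, fromNull), ArrayType(t, toNull)) =>
    (toNull || !fromNull) && writable(f, t)
  case (f, t) => f == t
}

val incoming = ArrayType(IntegerType, containsNull = false) // query output: never null
val column   = ArrayType(IntegerType, containsNull = true)  // target column: nullable

assert(writable(incoming, column))  // correct order: accepted
assert(!writable(column, incoming)) // swapped order asks the reverse question
```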




[spark] branch branch-3.0 updated: [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS

2020-10-14 Thread kabhwan
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new d9669bd  [SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS
d9669bd is described below

commit d9669bdf0ff4ed9951d7077b8dc9ad94507615c5
Author: Adam Binford 
AuthorDate: Thu Oct 15 11:59:29 2020 +0900

[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS

### What changes were proposed in this pull request?

Adds an additional check for non-fatal errors when attempting to add a new entry to the history server application listing.

### Why are the changes needed?

A bad rolling event log folder (missing appstatus file or no log files) would cause no applications to be loaded by the Spark history server. Figuring out why invalid event log folders are created in the first place will be addressed in separate issues; this change just lets the history server skip the invalid folder and successfully load all the valid applications.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

New UT

Closes #30037 from Kimahriman/bug/rolling-log-crashing-history.

Authored-by: Adam Binford 
Signed-off-by: Jungtaek Lim (HeartSaVioR) 
(cherry picked from commit 9ab0ec4e38e5df0537b38cb0f89e004ad57bec90)
Signed-off-by: Jungtaek Lim (HeartSaVioR) 
---
 .../spark/deploy/history/FsHistoryProvider.scala   |  3 ++
 .../deploy/history/FsHistoryProviderSuite.scala    | 49 ++
 2 files changed, 52 insertions(+)

diff --git a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index c262152..5970708 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -526,6 +526,9 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
           reader.fileSizeForLastIndex > 0
         } catch {
           case _: FileNotFoundException => false
+          case NonFatal(e) =>
+            logWarning(s"Error while reading new log ${reader.rootPath}", e)
+            false
         }

     case _: FileNotFoundException =>
diff --git a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
index c2f34fc..f3beb35 100644
--- a/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala
@@ -1470,6 +1470,55 @@ class FsHistoryProviderSuite extends SparkFunSuite with Matchers with Logging {
     }
   }
 
+  test("SPARK-33146: don't let one bad rolling log folder prevent loading other applications") {
+    withTempDir { dir =>
+      val conf = createTestConf(true)
+      conf.set(HISTORY_LOG_DIR, dir.getAbsolutePath)
+      val hadoopConf = SparkHadoopUtil.newConfiguration(conf)
+      val fs = new Path(dir.getAbsolutePath).getFileSystem(hadoopConf)
+
+      val provider = new FsHistoryProvider(conf)
+
+      val writer = new RollingEventLogFilesWriter("app", None, dir.toURI, conf, hadoopConf)
+      writer.start()
+
+      writeEventsToRollingWriter(writer, Seq(
+        SparkListenerApplicationStart("app", Some("app"), 0, "user", None),
+        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(dir.listFiles().size === 1)
+      assert(provider.getListing.length === 1)
+
+      // Manually delete the appstatus file to make an invalid rolling event log
+      val appStatusPath = RollingEventLogFilesWriter.getAppStatusFilePath(new Path(writer.logPath),
+        "app", None, true)
+      fs.delete(appStatusPath, false)
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(provider.getListing.length === 0)
+
+      // Create a new application
+      val writer2 = new RollingEventLogFilesWriter("app2", None, dir.toURI, conf, hadoopConf)
+      writer2.start()
+      writeEventsToRollingWriter(writer2, Seq(
+        SparkListenerApplicationStart("app2", Some("app2"), 0, "user", None),
+        SparkListenerJobStart(1, 0, Seq.empty)), rollFile = false)
+
+      // Both folders exist but only one application found
+      provider.checkForLogs()
+      provider.cleanLogs()
+      assert(provider.getListing.length === 1)
+      assert(dir.listFiles().size === 2)
+
+      // Make sure a new provider sees the valid application
+      provider.stop()
+      val newProvider = new FsHistoryProvider(conf)
+      newProvider.checkForLogs()
+      assert(newProvider.getListing.length === 1)
+    }
+  }
+
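
The fix relies on Scala's `NonFatal` extractor, which matches ordinary exceptions but deliberately lets fatal ones (`OutOfMemoryError`, `InterruptedException`, and the like) propagate. A minimal standalone sketch of the same skip-and-warn shape (illustrative only, not Spark's code):

```scala
import java.io.FileNotFoundException
import scala.util.control.NonFatal

// Illustrative stand-in for the listing check above: a missing file means
// "don't load", any other non-fatal failure is logged and skipped, and truly
// fatal errors still propagate and fail fast.
def shouldLoadLog(fileSize: () => Long, path: String): Boolean = {
  try {
    fileSize() > 0
  } catch {
    case _: FileNotFoundException => false
    case NonFatal(e) =>
      Console.err.println(s"Error while reading new log $path: $e") // Spark uses logWarning
      false
  }
}
```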
[spark] branch branch-2.4 updated: [SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
 new 816ffd7  [SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved
816ffd7 is described below

commit 816ffd7f10eaaaba334ca90554cda75beca8083f
Author: Jungtaek Lim (HeartSaVioR) 
AuthorDate: Wed Oct 14 20:09:37 2020 -0700

[SPARK-33136][SQL][2.4] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved

### What changes were proposed in this pull request?

This PR fixes a bug in `AppendData.outputResolved`:
`DataType.equalsIgnoreCompatibleNullability` was called with its two parameters
mistakenly swapped. The parameters are `from` and `to`, in that order, so the
correct arguments here are `inAttr` followed by `outAttr` (see the sketch below).
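
To make the direction of the check concrete, here is a minimal sketch against
the real `DataType.equalsIgnoreCompatibleNullability` API (the object name and
the sample types are illustrative only):

```scala
import org.apache.spark.sql.types.{ArrayType, DataType, IntegerType}

object ParamOrderSketch extends App {
  val in  = ArrayType(IntegerType, containsNull = false) // input: non-nullable elements
  val out = ArrayType(IntegerType, containsNull = true)  // table: nullable elements

  // Correct order (from = input, to = output): relaxing nullability is accepted.
  println(DataType.equalsIgnoreCompatibleNullability(in, out)) // true

  // Swapped order (the bug): tightening nullability is rejected.
  println(DataType.equalsIgnoreCompatibleNullability(out, in)) // false
}
```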

### Why are the changes needed?

Although the problematic code path is dead code on this branch, a known bug is
still worth fixing.

### Does this PR introduce _any_ user-facing change?

No; the fix only touches dead code.

### How was this patch tested?

Since the patch only touches dead code, it is hard to craft a meaningful test
for it. (The test from the original commit is no longer valid on this branch.)

Closes #30043 from HeartSaVioR/SPARK-33136-branch-2.4.

Authored-by: Jungtaek Lim (HeartSaVioR) 
Signed-off-by: Dongjoon Hyun 
---
 .../apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
index 1c25586..a0086c1 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
@@ -378,7 +378,7 @@ case class AppendData(
   case (inAttr, outAttr) =>
 // names and types must match, nullability must be compatible
 inAttr.name == outAttr.name &&
-DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, 
inAttr.dataType) &&
+DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, 
outAttr.dataType) &&
 (outAttr.nullable || !inAttr.nullable)
 }
   }


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (f3ad32f -> 9ab0ec4)

2020-10-14 Thread kabhwan
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from f3ad32f  [SPARK-33026][SQL][FOLLOWUP] metrics name should be 
numOutputRows
 add 9ab0ec4  [SPARK-33146][CORE] Check for non-fatal errors when loading 
new applications in SHS

No new revisions were added by this update.

Summary of changes:
 .../spark/deploy/history/FsHistoryProvider.scala   |  3 ++
 .../deploy/history/FsHistoryProviderSuite.scala| 49 ++
 2 files changed, 52 insertions(+)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (8e5cb1d -> f3ad32f)

2020-10-14 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 8e5cb1d  [SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved
 add f3ad32f  [SPARK-33026][SQL][FOLLOWUP] metrics name should be 
numOutputRows

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/execution/exchange/BroadcastExchangeExec.scala  | 8 
 .../org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala   | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-33026][SQL][FOLLOWUP] metrics name should be numOutputRows

2020-10-14 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new f3ad32f  [SPARK-33026][SQL][FOLLOWUP] metrics name should be 
numOutputRows
f3ad32f is described below

commit f3ad32f4b6fc55e89e7fb222ed565ad3e32d47c6
Author: Wenchen Fan 
AuthorDate: Wed Oct 14 16:17:28 2020 +

[SPARK-33026][SQL][FOLLOWUP] metrics name should be numOutputRows

### What changes were proposed in this pull request?

Follow the existing convention and rename the metric `numRows` to `numOutputRows`.

### Why are the changes needed?

`FilterExec`, `HashAggregateExec`, etc. all use `numOutputRows`
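
As a hedged illustration of where the renamed key becomes visible (this assumes
an active `SparkSession` named `spark` and a plan that actually contains a
`BroadcastExchangeExec`; it is a sketch, not part of the patch):

```scala
import org.apache.spark.sql.execution.exchange.BroadcastExchangeExec

val df = spark.range(2).join(spark.range(10), "id") // small side should be broadcast
df.collect() // run the query so the metrics get populated

df.queryExecution.executedPlan.foreach {
  case b: BroadcastExchangeExec =>
    println(b.metrics("numOutputRows").value) // the key was "numRows" before this patch
  case _ => // other operators: nothing to do
}
```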

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests

Closes #30039 from cloud-fan/minor.

Authored-by: Wenchen Fan 
Signed-off-by: Wenchen Fan 
---
 .../spark/sql/execution/exchange/BroadcastExchangeExec.scala  | 8 
 .../org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala   | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
index 4b884df..0c5fee2 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
@@ -78,7 +78,7 @@ case class BroadcastExchangeExec(
 
   override lazy val metrics = Map(
 "dataSize" -> SQLMetrics.createSizeMetric(sparkContext, "data size"),
-"numRows" -> SQLMetrics.createMetric(sparkContext, "number of rows"),
+"numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output 
rows"),
 "collectTime" -> SQLMetrics.createTimingMetric(sparkContext, "time to 
collect"),
 "buildTime" -> SQLMetrics.createTimingMetric(sparkContext, "time to 
build"),
 "broadcastTime" -> SQLMetrics.createTimingMetric(sparkContext, "time to 
broadcast"))
@@ -91,8 +91,8 @@ case class BroadcastExchangeExec(
 
   override def runtimeStatistics: Statistics = {
 val dataSize = metrics("dataSize").value
-val numRows = metrics("numRows").value
-Statistics(dataSize, Some(numRows))
+val rowCount = metrics("numOutputRows").value
+Statistics(dataSize, Some(rowCount))
   }
 
   @transient
@@ -116,11 +116,11 @@ case class BroadcastExchangeExec(
 val beforeCollect = System.nanoTime()
 // Use executeCollect/executeCollectIterator to avoid conversion 
to Scala types
 val (numRows, input) = child.executeCollectIterator()
+longMetric("numOutputRows") += numRows
 if (numRows >= MAX_BROADCAST_TABLE_ROWS) {
   throw new SparkException(
 s"Cannot broadcast the table over $MAX_BROADCAST_TABLE_ROWS 
rows: $numRows rows")
 }
-longMetric("numRows") += numRows
 
 val beforeBuild = System.nanoTime()
 longMetric("collectTime") += NANOSECONDS.toMillis(beforeBuild - 
beforeCollect)
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
index e404e46..4872906 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
@@ -751,7 +751,7 @@ class SQLMetricsSuite extends SharedSparkSession with 
SQLMetricsTestUtils
 }
 
 assert(exchanges.size === 1)
-testMetricsInSparkPlanOperator(exchanges.head, Map("numRows" -> 2))
+testMetricsInSparkPlanOperator(exchanges.head, Map("numOutputRows" -> 
2))
   }
 }
   }


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 2ebea13  [SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved
2ebea13 is described below

commit 2ebea135a1f4e4cb3187ffc8e77d3f52d3b3a91a
Author: Jungtaek Lim (HeartSaVioR) 
AuthorDate: Wed Oct 14 08:30:03 2020 -0700

[SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved

### What changes were proposed in this pull request?

This PR fixes a bug in `V2WriteCommand.outputResolved`:
`DataType.equalsIgnoreCompatibleNullability` was called with its two parameters
mistakenly swapped. The parameters are `from` and `to`, in that order, so the
correct arguments here are `inAttr` followed by `outAttr`.

### Why are the changes needed?

Spark throws an AnalysisException for an unresolved operator in v2 writes; the
operator is unresolved only because the parameters passed to
`DataType.equalsIgnoreCompatibleNullability` in `outputResolved` were swapped.

### Does this PR introduce _any_ user-facing change?

Yes. End users no longer hit the unresolved-operator error in v2 writes when
writing a DataFrame that contains non-nullable complex types into a table whose
matching complex types are nullable (a rough repro sketch follows).
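
A rough repro sketch of the previously failing case (the catalog name `testcat`,
the `USING foo` provider, and the session setup are assumptions borrowed from
Spark's own v2 test fixtures, not a ready-made configuration):

```scala
import org.apache.spark.sql.functions.{array, col}

// Assumes spark.sql.catalog.testcat points at a v2 catalog implementation,
// e.g. the InMemoryTableCatalog used by Spark's tests.
spark.sql("CREATE TABLE testcat.ns.t (id BIGINT, arr ARRAY<BIGINT>) USING foo")

// array(col("id")) yields ArrayType(LongType, containsNull = false).
val df = spark.range(3).withColumn("arr", array(col("id")))

df.writeTo("testcat.ns.t").append() // threw "unresolved operator" before this fix
```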

### How was this patch tested?

New UT added.

Closes #30033 from HeartSaVioR/SPARK-33136.

Authored-by: Jungtaek Lim (HeartSaVioR) 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 8e5cb1d276686ec428e4e6aa1c3cfd6bb99e4e9a)
Signed-off-by: Dongjoon Hyun 
---
 .../sql/catalyst/plans/logical/v2Commands.scala|  2 +-
 .../apache/spark/sql/DataFrameWriterV2Suite.scala  | 87 +-
 2 files changed, 84 insertions(+), 5 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
index b4120d9..d2830a9 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
@@ -45,7 +45,7 @@ trait V2WriteCommand extends Command {
   case (inAttr, outAttr) =>
 // names and types must match, nullability must be compatible
 inAttr.name == outAttr.name &&
-  DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, 
inAttr.dataType) &&
+  DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, 
outAttr.dataType) &&
   (outAttr.nullable || !inAttr.nullable)
 }
 }
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
index 508eefa..ff5c624 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
@@ -23,16 +23,15 @@ import scala.collection.JavaConverters._
 
 import org.scalatest.BeforeAndAfter
 
-import 
org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, 
NoSuchTableException, TableAlreadyExistsException}
-import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, 
OverwriteByExpression, OverwritePartitionsDynamic}
+import 
org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, 
NamedRelation, NoSuchTableException, TableAlreadyExistsException}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, 
OverwriteByExpression, OverwritePartitionsDynamic, V2WriteCommand}
 import org.apache.spark.sql.connector.{InMemoryTable, InMemoryTableCatalog}
 import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog}
 import org.apache.spark.sql.connector.expressions.{BucketTransform, 
DaysTransform, FieldReference, HoursTransform, IdentityTransform, LiteralValue, 
MonthsTransform, YearsTransform}
 import org.apache.spark.sql.execution.QueryExecution
 import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation
 import org.apache.spark.sql.test.SharedSparkSession
-import org.apache.spark.sql.types.{IntegerType, LongType, StringType, 
StructType}
-import org.apache.spark.sql.types.TimestampType
+import org.apache.spark.sql.types.{ArrayType, DataType, IntegerType, LongType, 
MapType, StringType, StructField, StructType, TimestampType}
 import org.apache.spark.sql.util.QueryExecutionListener
 import org.apache.spark.unsafe.types.UTF8String
 import org.apache.spark.util.Utils
@@ -101,6 +100,86 @@ class DataFrameWriterV2Suite extends QueryTest with 

[spark] branch master updated (d8c4a47 -> 8e5cb1d)

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from d8c4a47  [SPARK-33061][SQL] Expose inverse hyperbolic trig functions 
through sql.functions API
 add 8e5cb1d  [SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/plans/logical/v2Commands.scala|  2 +-
 .../apache/spark/sql/DataFrameWriterV2Suite.scala  | 87 +-
 2 files changed, 84 insertions(+), 5 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 2ebea13  [SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved
2ebea13 is described below

commit 2ebea135a1f4e4cb3187ffc8e77d3f52d3b3a91a
Author: Jungtaek Lim (HeartSaVioR) 
AuthorDate: Wed Oct 14 08:30:03 2020 -0700

[SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved

### What changes were proposed in this pull request?

This PR proposes to fix a bug on calling 
`DataType.equalsIgnoreCompatibleNullability` with mistakenly swapped parameters 
in `V2WriteCommand.outputResolved`. The order of parameters for 
`DataType.equalsIgnoreCompatibleNullability` are `from` and `to`, which says 
that the right order of matching variables are `inAttr` and `outAttr`.

### Why are the changes needed?

Spark throws AnalysisException due to unresolved operator in v2 write, 
while the operator is unresolved due to a bug that parameters to call 
`DataType.equalsIgnoreCompatibleNullability` in `outputResolved` have been 
swapped.

### Does this PR introduce _any_ user-facing change?

Yes, end users no longer suffer on unresolved operator in v2 write if 
they're trying to write dataframe containing non-nullable complex types against 
table matching complex types as nullable.

### How was this patch tested?

New UT added.

Closes #30033 from HeartSaVioR/SPARK-33136.

Authored-by: Jungtaek Lim (HeartSaVioR) 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 8e5cb1d276686ec428e4e6aa1c3cfd6bb99e4e9a)
Signed-off-by: Dongjoon Hyun 
---
 .../sql/catalyst/plans/logical/v2Commands.scala|  2 +-
 .../apache/spark/sql/DataFrameWriterV2Suite.scala  | 87 +-
 2 files changed, 84 insertions(+), 5 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
index b4120d9..d2830a9 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
@@ -45,7 +45,7 @@ trait V2WriteCommand extends Command {
   case (inAttr, outAttr) =>
 // names and types must match, nullability must be compatible
 inAttr.name == outAttr.name &&
-  DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, 
inAttr.dataType) &&
+  DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, 
outAttr.dataType) &&
   (outAttr.nullable || !inAttr.nullable)
 }
 }
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
index 508eefa..ff5c624 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
@@ -23,16 +23,15 @@ import scala.collection.JavaConverters._
 
 import org.scalatest.BeforeAndAfter
 
-import 
org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, 
NoSuchTableException, TableAlreadyExistsException}
-import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, 
OverwriteByExpression, OverwritePartitionsDynamic}
+import 
org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, 
NamedRelation, NoSuchTableException, TableAlreadyExistsException}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, 
OverwriteByExpression, OverwritePartitionsDynamic, V2WriteCommand}
 import org.apache.spark.sql.connector.{InMemoryTable, InMemoryTableCatalog}
 import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog}
 import org.apache.spark.sql.connector.expressions.{BucketTransform, 
DaysTransform, FieldReference, HoursTransform, IdentityTransform, LiteralValue, 
MonthsTransform, YearsTransform}
 import org.apache.spark.sql.execution.QueryExecution
 import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation
 import org.apache.spark.sql.test.SharedSparkSession
-import org.apache.spark.sql.types.{IntegerType, LongType, StringType, 
StructType}
-import org.apache.spark.sql.types.TimestampType
+import org.apache.spark.sql.types.{ArrayType, DataType, IntegerType, LongType, 
MapType, StringType, StructField, StructType, TimestampType}
 import org.apache.spark.sql.util.QueryExecutionListener
 import org.apache.spark.unsafe.types.UTF8String
 import org.apache.spark.util.Utils
@@ -101,6 +100,86 @@ class DataFrameWriterV2Suite extends QueryTest with 

[spark] branch master updated (d8c4a47 -> 8e5cb1d)

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from d8c4a47  [SPARK-33061][SQL] Expose inverse hyperbolic trig functions 
through sql.functions API
 add 8e5cb1d  [SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/plans/logical/v2Commands.scala|  2 +-
 .../apache/spark/sql/DataFrameWriterV2Suite.scala  | 87 +-
 2 files changed, 84 insertions(+), 5 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 2ebea13  [SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved
2ebea13 is described below

commit 2ebea135a1f4e4cb3187ffc8e77d3f52d3b3a91a
Author: Jungtaek Lim (HeartSaVioR) 
AuthorDate: Wed Oct 14 08:30:03 2020 -0700

[SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved

### What changes were proposed in this pull request?

This PR proposes to fix a bug on calling 
`DataType.equalsIgnoreCompatibleNullability` with mistakenly swapped parameters 
in `V2WriteCommand.outputResolved`. The order of parameters for 
`DataType.equalsIgnoreCompatibleNullability` are `from` and `to`, which says 
that the right order of matching variables are `inAttr` and `outAttr`.

### Why are the changes needed?

Spark throws AnalysisException due to unresolved operator in v2 write, 
while the operator is unresolved due to a bug that parameters to call 
`DataType.equalsIgnoreCompatibleNullability` in `outputResolved` have been 
swapped.

### Does this PR introduce _any_ user-facing change?

Yes, end users no longer suffer on unresolved operator in v2 write if 
they're trying to write dataframe containing non-nullable complex types against 
table matching complex types as nullable.

### How was this patch tested?

New UT added.

Closes #30033 from HeartSaVioR/SPARK-33136.

Authored-by: Jungtaek Lim (HeartSaVioR) 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 8e5cb1d276686ec428e4e6aa1c3cfd6bb99e4e9a)
Signed-off-by: Dongjoon Hyun 
---
 .../sql/catalyst/plans/logical/v2Commands.scala|  2 +-
 .../apache/spark/sql/DataFrameWriterV2Suite.scala  | 87 +-
 2 files changed, 84 insertions(+), 5 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
index b4120d9..d2830a9 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
@@ -45,7 +45,7 @@ trait V2WriteCommand extends Command {
   case (inAttr, outAttr) =>
 // names and types must match, nullability must be compatible
 inAttr.name == outAttr.name &&
-  DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, 
inAttr.dataType) &&
+  DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, 
outAttr.dataType) &&
   (outAttr.nullable || !inAttr.nullable)
 }
 }
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
index 508eefa..ff5c624 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
@@ -23,16 +23,15 @@ import scala.collection.JavaConverters._
 
 import org.scalatest.BeforeAndAfter
 
-import 
org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, 
NoSuchTableException, TableAlreadyExistsException}
-import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, 
OverwriteByExpression, OverwritePartitionsDynamic}
+import 
org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, 
NamedRelation, NoSuchTableException, TableAlreadyExistsException}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, 
OverwriteByExpression, OverwritePartitionsDynamic, V2WriteCommand}
 import org.apache.spark.sql.connector.{InMemoryTable, InMemoryTableCatalog}
 import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog}
 import org.apache.spark.sql.connector.expressions.{BucketTransform, 
DaysTransform, FieldReference, HoursTransform, IdentityTransform, LiteralValue, 
MonthsTransform, YearsTransform}
 import org.apache.spark.sql.execution.QueryExecution
 import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation
 import org.apache.spark.sql.test.SharedSparkSession
-import org.apache.spark.sql.types.{IntegerType, LongType, StringType, 
StructType}
-import org.apache.spark.sql.types.TimestampType
+import org.apache.spark.sql.types.{ArrayType, DataType, IntegerType, LongType, 
MapType, StringType, StructField, StructType, TimestampType}
 import org.apache.spark.sql.util.QueryExecutionListener
 import org.apache.spark.unsafe.types.UTF8String
 import org.apache.spark.util.Utils
@@ -101,6 +100,86 @@ class DataFrameWriterV2Suite extends QueryTest with 

[spark] branch master updated (d8c4a47 -> 8e5cb1d)

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from d8c4a47  [SPARK-33061][SQL] Expose inverse hyperbolic trig functions 
through sql.functions API
 add 8e5cb1d  [SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/plans/logical/v2Commands.scala|  2 +-
 .../apache/spark/sql/DataFrameWriterV2Suite.scala  | 87 +-
 2 files changed, 84 insertions(+), 5 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand.outputResolved

2020-10-14 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 2ebea13  [SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved
2ebea13 is described below

commit 2ebea135a1f4e4cb3187ffc8e77d3f52d3b3a91a
Author: Jungtaek Lim (HeartSaVioR) 
AuthorDate: Wed Oct 14 08:30:03 2020 -0700

[SPARK-33136][SQL] Fix mistakenly swapped parameter in 
V2WriteCommand.outputResolved

### What changes were proposed in this pull request?

This PR proposes to fix a bug on calling 
`DataType.equalsIgnoreCompatibleNullability` with mistakenly swapped parameters 
in `V2WriteCommand.outputResolved`. The order of parameters for 
`DataType.equalsIgnoreCompatibleNullability` are `from` and `to`, which says 
that the right order of matching variables are `inAttr` and `outAttr`.

### Why are the changes needed?

Spark throws AnalysisException due to unresolved operator in v2 write, 
while the operator is unresolved due to a bug that parameters to call 
`DataType.equalsIgnoreCompatibleNullability` in `outputResolved` have been 
swapped.

### Does this PR introduce _any_ user-facing change?

Yes, end users no longer suffer on unresolved operator in v2 write if 
they're trying to write dataframe containing non-nullable complex types against 
table matching complex types as nullable.

### How was this patch tested?

New UT added.

Closes #30033 from HeartSaVioR/SPARK-33136.

Authored-by: Jungtaek Lim (HeartSaVioR) 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 8e5cb1d276686ec428e4e6aa1c3cfd6bb99e4e9a)
Signed-off-by: Dongjoon Hyun 
---
 .../sql/catalyst/plans/logical/v2Commands.scala|  2 +-
 .../apache/spark/sql/DataFrameWriterV2Suite.scala  | 87 +-
 2 files changed, 84 insertions(+), 5 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
index b4120d9..d2830a9 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
@@ -45,7 +45,7 @@ trait V2WriteCommand extends Command {
   case (inAttr, outAttr) =>
 // names and types must match, nullability must be compatible
 inAttr.name == outAttr.name &&
-  DataType.equalsIgnoreCompatibleNullability(outAttr.dataType, inAttr.dataType) &&
+  DataType.equalsIgnoreCompatibleNullability(inAttr.dataType, outAttr.dataType) &&
   (outAttr.nullable || !inAttr.nullable)
 }
 }
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
index 508eefa..ff5c624 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala
@@ -23,16 +23,15 @@ import scala.collection.JavaConverters._
 
 import org.scalatest.BeforeAndAfter
 
-import org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, NoSuchTableException, TableAlreadyExistsException}
-import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, OverwriteByExpression, OverwritePartitionsDynamic}
+import org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, NamedRelation, NoSuchTableException, TableAlreadyExistsException}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, OverwriteByExpression, OverwritePartitionsDynamic, V2WriteCommand}
 import org.apache.spark.sql.connector.{InMemoryTable, InMemoryTableCatalog}
 import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog}
 import org.apache.spark.sql.connector.expressions.{BucketTransform, DaysTransform, FieldReference, HoursTransform, IdentityTransform, LiteralValue, MonthsTransform, YearsTransform}
 import org.apache.spark.sql.execution.QueryExecution
 import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation
 import org.apache.spark.sql.test.SharedSparkSession
-import org.apache.spark.sql.types.{IntegerType, LongType, StringType, StructType}
-import org.apache.spark.sql.types.TimestampType
+import org.apache.spark.sql.types.{ArrayType, DataType, IntegerType, LongType, MapType, StringType, StructField, StructType, TimestampType}
 import org.apache.spark.sql.util.QueryExecutionListener
 import org.apache.spark.unsafe.types.UTF8String
 import org.apache.spark.util.Utils
@@ -101,6 +100,86 @@ class DataFrameWriterV2Suite extends QueryTest with 

[spark] branch master updated (05a62dc -> d8c4a47)

2020-10-14 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 05a62dc  [SPARK-33134][SQL] Return partial results only for root JSON 
objects
 add d8c4a47  [SPARK-33061][SQL] Expose inverse hyperbolic trig functions 
through sql.functions API

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/functions.scala | 50 +-
 .../org/apache/spark/sql/MathFunctionsSuite.scala  | 15 +++
 2 files changed, 64 insertions(+), 1 deletion(-)
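
For reference, a hedged usage sketch of the new wrappers, assuming (per the 
JIRA title) they are the inverse hyperbolic counterparts `asinh`, `acosh`, 
and `atanh` in `org.apache.spark.sql.functions`:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{acosh, asinh, atanh, col}

val spark = SparkSession.builder()
  .master("local[*]").appName("inverse-hyperbolic").getOrCreate()
import spark.implicits._

// asinh is defined for all reals; acosh needs x >= 1 and atanh needs
// |x| < 1, so out-of-domain inputs come back as NaN.
val df = Seq(0.5, 1.0, 2.0).toDF("x")
df.select(col("x"), asinh(col("x")), acosh(col("x")), atanh(col("x"))).show()
spark.stop()
```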


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org


