[GitHub] spark pull request #20564: [SPARK-23378][SQL] move setCurrentDatabase from H...

2018-02-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20564#discussion_r167392786
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala 
---
@@ -598,8 +598,22 @@ private[hive] class HiveClientImpl(
   db: String,
   table: String,
   newParts: Seq[CatalogTablePartition]): Unit = withHiveState {
-val hiveTable = toHiveTable(getTable(db, table), Some(userName))
-shim.alterPartitions(client, table, newParts.map { p => 
toHivePartition(p, hiveTable) }.asJava)
+// Note: Before altering table partitions in Hive, you *must* set the 
current database
+// to the one that contains the table of interest. Otherwise you will 
end up with the
+// most helpful error message ever: "Unable to alter partition. alter 
is not possible."
+// See HIVE-2742 for more detail.
+val original = state.getCurrentDatabase
+if (databaseExists(db)) {
+  state.setCurrentDatabase(db)
+} else {
+  throw new NoSuchDatabaseException(db)
+}
--- End diff --

Instead of copying the codes from `setCurrentDatabase`, why not directly 
calling it?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20564: [SPARK-23378][SQL] move setCurrentDatabase from H...

2018-02-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20564#discussion_r167392724
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -1107,11 +1107,6 @@ private[spark] class HiveExternalCatalog(conf: 
SparkConf, hadoopConf: Configurat
   }
 }
 
-// Note: Before altering table partitions in Hive, you *must* set the 
current database
-// to the one that contains the table of interest. Otherwise you will 
end up with the
-// most helpful error message ever: "Unable to alter partition. alter 
is not possible."
-// See HIVE-2742 for more detail.
-client.setCurrentDatabase(db)
--- End diff --

This does not change the behavior. Just to restore the original current 
database in the hive client. I do not think the current implementation has any 
issue, because we do not rely on the current database in the hive client. 
However, I am also fine to restore the original. This change does not have any 
harm but just make the codes more stable for the future refactoring.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20407: [SPARK-23124][SQL] Allow to disable BroadcastNest...

2018-02-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20407#discussion_r167392674
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -156,6 +156,15 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val ALLOW_NESTEDJOIN_FALLBACK = 
buildConf("spark.sql.join.broadcastJoinFallback.enabled")
+.internal()
+.doc("When true (default), if the other options are not available, 
fallback to try and use " +
+  "BroadcastNestedLoopJoin as join strategy. This can cause OOM which 
can be a problem " +
+  "in some scenarios, eg. when running the thriftserver. Turn to false 
to disable it: an " +
+  "AnalysisException will be thrown.")
--- End diff --

I see and in general I agree with you, but in this case 
`BroadcastNestedLoopJoin` is used as a fallback even though we are over the 
threshold for the broadcast joins. So it is very likely that this is going to 
throw an OOM.
So the rationale of this choice seems to me: I don't have any working join 
implementation, then just go for this one even though I know in advance that a 
OOM is likely to occur; anyway I don't have other choices, so if the OOM won't 
occur I have been lucky and I have been able to run it, otherwise it throws an 
OOM instead of an `AnalysisException` (so I am not able to run this scenario 
anyway). And this is fine if we think to normal Spark application, while for 
long running ones like STS an OOM is much worse than an `AnalysisException`.
Therefore, I know that this is not the solution to all OOM problem, but the 
reason of this PR is that I think that when this rule was introduced, the 
choice was made "forgetting" of use cases like STS. Do you agree with me?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17185: [SPARK-19602][SQL] Support column resolution of fully qu...

2018-02-10 Thread mafernandez-stratio
Github user mafernandez-stratio commented on the issue:

https://github.com/apache/spark/pull/17185
  
@gatorsmile Great, thanks for the feedback!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20407: [SPARK-23124][SQL] Allow to disable BroadcastNest...

2018-02-10 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20407#discussion_r167392509
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -156,6 +156,15 @@ object SQLConf {
 .booleanConf
 .createWithDefault(true)
 
+  val ALLOW_NESTEDJOIN_FALLBACK = 
buildConf("spark.sql.join.broadcastJoinFallback.enabled")
+.internal()
+.doc("When true (default), if the other options are not available, 
fallback to try and use " +
+  "BroadcastNestedLoopJoin as join strategy. This can cause OOM which 
can be a problem " +
+  "in some scenarios, eg. when running the thriftserver. Turn to false 
to disable it: an " +
+  "AnalysisException will be thrown.")
--- End diff --

OOM could be triggered by various cases. `BroadcastNestedLoopJoin` is being 
widely used in many cases. If this is the case in your scenario, you can do it 
in your fork.

Also, this PR just helps a very specific case. I do not think it resolves 
the general issue.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20531: [SPARK-23352][PYTHON] Explicitly specify supported types...

2018-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20531
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20561: [SPARK-23376][SQL] creating UnsafeKVExternalSorter with ...

2018-02-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20561
  
**[Test build #87279 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87279/testReport)**
 for PR 20561 at commit 
[`8dab79a`](https://github.com/apache/spark/commit/8dab79a46d9201ec0e43e60d6ef841bf91f4c616).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20561: [SPARK-23376][SQL] creating UnsafeKVExternalSorter with ...

2018-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20561
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20531: [SPARK-23352][PYTHON] Explicitly specify supported types...

2018-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20531
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87280/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20561: [SPARK-23376][SQL] creating UnsafeKVExternalSorter with ...

2018-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20561
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87279/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20531: [SPARK-23352][PYTHON] Explicitly specify supported types...

2018-02-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20531
  
**[Test build #87280 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87280/testReport)**
 for PR 20531 at commit 
[`07f2d78`](https://github.com/apache/spark/commit/07f2d781022a44186fe5ab4c5621d80acce0711f).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



<    1   2   3