[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-29 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/22186
  
This will eliminate a race condition between FS shutdown (in the Hadoop 
shutdown manager) and the Hive callback. There's a risk today that the 
filesystems will be closed before the event log close()/rename() is called, so 
things don't get saved, and this can happen with any FS.

Registering the shutdown hook via the Spark APIs, with a priority higher than 
that of the FS shutdown, guarantees that it will be called before the FS 
shutdown. But it doesn't guarantee that the operation will complete within the 
10s time limit hard-coded into Hadoop 2.8.x+ for any single shutdown hook to 
complete. It is going to work in HDFS except in the special case of an HDFS NN 
lock or GC pause.
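
To make the ordering and timeout behaviour concrete, here is a minimal Java sketch. It is a toy model, not Hadoop's or Spark's `ShutdownHookManager`: the class `PriorityHookSketch`, the hook names, the priorities (50, 30, 10) and the timeouts are all made up for illustration. It runs hooks in descending priority and gives each one a fixed time budget, so a higher-priority hook starts before the low-priority FS close, yet a slow hook can still be cut off.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model of a priority-ordered hook runner; hypothetical, for illustration only. */
final class PriorityHookSketch {

    /** A named hook; a higher priority means it runs earlier. */
    static final class Hook {
        final String name;
        final int priority;
        final Runnable body;

        Hook(String name, int priority, Runnable body) {
            this.name = name;
            this.priority = priority;
            this.body = body;
        }
    }

    private final List<Hook> hooks = new ArrayList<>();

    void add(String name, int priority, Runnable body) {
        hooks.add(new Hook(name, priority, body));
    }

    /** Runs hooks in descending priority; returns the names that finished in time. */
    List<String> runAll(long timeoutMillis) {
        List<Hook> ordered = new ArrayList<>(hooks);
        ordered.sort(Comparator.comparingInt((Hook h) -> h.priority).reversed());
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<String> completed = new ArrayList<>();
        try {
            for (Hook hook : ordered) {
                Future<?> future = executor.submit(hook.body);
                try {
                    future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    completed.add(hook.name);
                } catch (TimeoutException e) {
                    future.cancel(true); // interrupt the slow hook and move on
                } catch (ExecutionException e) {
                    // the hook threw; treat it as not completed
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        } finally {
            executor.shutdownNow();
        }
        return completed;
    }

    /** Three illustrative hooks: a fast flush, a slow rename, and the FS close. */
    static List<String> demo() {
        PriorityHookSketch manager = new PriorityHookSketch();
        manager.add("fs-close", 10, () -> { });
        manager.add("event-log-flush", 50, () -> { });
        manager.add("slow-rename", 30, () -> {
            try {
                Thread.sleep(2000); // stands in for an O(data) rename
            } catch (InterruptedException e) {
                // cut off by the runner's timeout
            }
        });
        return manager.runAll(200);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the flush runs first and the FS close last, while the slow rename exceeds its 200 ms budget and is not reported as completed. That is the same shape as the concern above: priority fixes the order, but not the time available.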

The configurable Hadoop shutdown timeout of 
[HADOOP-15679](https://issues.apache.org/jira/browse/HADOOP-15679) needs to go 
in. I've increased the default timeout to 30s there for more forgiveness with 
HDFS. For object stores with O(data) renames, people should configure it 
with a timeout of minutes or, if they want to turn it off altogether, hours.

I'm backporting HADOOP-15679 to all branches 2.8.x+, so all Hadoop versions 
with that timeout will have the timeout configurable and the default time 
extended.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-29 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/22186
  
The fix itself LGTM, but I don't think this could solve the STS shutdown 
hook conflict problem with Hadoop.





[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22186
  
Merged build finished. Test PASSed.





[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22186
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95385/





[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22186
  
**[Test build #95385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95385/testReport)** for PR 22186 at commit [`fbced52`](https://github.com/apache/spark/commit/fbced52e5687cd5eb6a06c3b9bca5cbeb9343002).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-28 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/22186
  
The latest patch builds locally.

Maven test outcome:
* lots of JSON missing-method errors, clearly Jackson version problems of 
some kind
* I don't see log messages of the Hive shutdown appearing in the output, though 
after all the tests finish I do get a log showing that the FS cleanup is going on

```
18/08/28 22:09:58 INFO ShutdownHookManager: Shutdown hook called
18/08/28 22:09:58 INFO ShutdownHookManager: Deleting directory ...spark/sql/hive-thriftserver/target/tmp/
```

I think it might be possible to actually test whether the shutdown hook was 
added by calling remove(hook) in a test and verifying that the hook was found, 
that is, that it was registered. That would need some caching of the hook and a 
package-level removeHook method in the HiveServer, though wiring it all the way 
up to a test case would be tricky...
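
A rough Java sketch of that testing idea follows. Everything in it is hypothetical: `HookRegistrySketch` stands in for whichever shutdown hook manager is used, and `ServerSketch` with its package-level `removeShutdownHook()` stands in for the server. The only assumption carried over from the comment is that removing a hook reports whether it was found.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a shutdown hook manager whose remove() reports success. */
final class HookRegistrySketch {
    private final Set<Runnable> hooks = ConcurrentHashMap.newKeySet();

    /** Registers the hook and returns it so the caller can cache the reference. */
    Runnable add(Runnable hook) {
        hooks.add(hook);
        return hook;
    }

    /** Returns true only if the hook was registered. */
    boolean remove(Runnable hook) {
        return hooks.remove(hook);
    }
}

/** Hypothetical server that caches its hook so a test can probe for it. */
final class ServerSketch {
    private final HookRegistrySketch registry;
    private Runnable shutdownHook; // cached reference to the registered hook

    ServerSketch(HookRegistrySketch registry) {
        this.registry = registry;
    }

    void start() {
        shutdownHook = registry.add(this::stop);
    }

    void stop() {
        // close event logs, stop services, and so on
    }

    /** Package-level helper for tests: true if the hook had been registered. */
    boolean removeShutdownHook() {
        return shutdownHook != null && registry.remove(shutdownHook);
    }
}
```

A test would call `start()`, then assert that `removeShutdownHook()` returns true. The probe is destructive, since a second call returns false, so it belongs at the end of a test.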





[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22186
  
Merged build finished. Test PASSed.





[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22186
  
**[Test build #95385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95385/testReport)** for PR 22186 at commit [`fbced52`](https://github.com/apache/spark/commit/fbced52e5687cd5eb6a06c3b9bca5cbeb9343002).





[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22186
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2646/





[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-28 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/22186
  
My local build wasn't including that module; it now does and the link works 
with a subclass of `AbstractFunction0`.
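
As an illustration of why an abstract-class adapter can link where a lambda does not, here is a self-contained Java sketch. It does not use the Scala library: `LegacyFunction0` and `AbstractLegacyFunction0` are hypothetical stand-ins, and the assumption (not confirmed in this thread) is that javac sees the Scala function type as an interface with more than one abstract method, which rules out a lambda.

```java
/** Hypothetical model of calling a multi-method function type from Java. */
final class LegacyFunctionSketch {

    /** Stand-in for a function type that exposes two abstract methods to javac. */
    interface LegacyFunction0<R> {
        R apply();

        void applyVoid();
    }

    /** Adapter that fills in the extra method, leaving only apply() to implement. */
    abstract static class AbstractLegacyFunction0<R> implements LegacyFunction0<R> {
        @Override
        public void applyVoid() {
            apply();
        }
    }

    /** Stand-in for an API that accepts the function type as a callback. */
    static <R> R invoke(LegacyFunction0<R> callback) {
        return callback.apply();
    }

    static String demo() {
        // invoke(() -> "closed") would not compile here, because
        // LegacyFunction0 is not a functional interface.
        return invoke(new AbstractLegacyFunction0<String>() {
            @Override
            public String apply() {
                return "closed";
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The anonymous subclass only has to supply `apply()`, which is the role the `AbstractFunction0` subclass plays in the patch.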

The local tests are failing under Maven with a Hive/Jackson mismatch though. 
I'm going to consider that a separate issue.

```
#c=cvalue;d=dvalue
- SPARK-16563 ThriftCLIService FetchResults repeat fetching result *** FAILED ***
  java.sql.SQLException: java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods$.parse$default$3()Z
  at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftJdbcTest$$anonfun$withMultipleConnectionJdbcStatement$2.apply(HiveThriftServer2Suites.scala:814)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftJdbcTest$$anonfun$withMultipleConnectionJdbcStatement$2.apply(HiveThriftServer2Suites.scala:813)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftJdbcTest.withMultipleConnectionJdbcStatement(HiveThriftServer2Suites.scala:813)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftJdbcTest.withJdbcStatement(HiveThriftServer2Suites.scala:822)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(HiveThriftServer2Suites.scala:100)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite$$anonfun$2$$anonfun$apply$mcV$sp$2.apply(HiveThriftServer2Suites.scala:96)
  at org.apache.spark.sql.hive.thriftserver.HiveThriftBinaryServerSuite.org$apache$spark$sql$hive$thriftserver$HiveThriftBinaryServerSuite$$withCLIServiceClient(HiveThriftServer2Suites.scala:71)
```





[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-26 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/22186
  
My local Maven build *did* work, so maybe it's a javac/JVM version thing. 
Will move back to a Java class callback.





[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-24 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/22186
  
My local Maven build also failed.

I think the problem is that `ShutdownHookManager` is implemented in Scala, 
so the compiled method signature may be different when invoked from Java. I'm 
not sure how a Scala anonymous function is translated to Java, but the failure 
seems to be due to this.

(Maven has some detailed failure information, whereas SBT doesn't have 
anything.)





[GitHub] spark issue #22186: [SPARK-25183][SQL][WIP] Spark HiveServer2 to use Spark S...

2018-08-22 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/22186
  
Not sure what is up with the build here; it worked with mvn locally. Possibly 
my use of a Java 8 lambda expression as the hook?

