I believe that was fixed in 3.0 and there was a decision not to backport the fix: SPARK-31170 <https://issues.apache.org/jira/browse/SPARK-31170>
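Incidentally, the AnalysisException text in the transcript below is produced by the small wrapper ("deco") in pyspark/sql/utils.py that appears in the traceback: it strips the leading Java exception class name from the Py4J error string and raises the remainder as the Python-side message. A simplified sketch of that conversion (an illustration, not the actual pyspark source; the error string is copied from the transcript):

```python
# Simplified sketch of the message handling done by "deco" in
# pyspark/sql/utils.py (visible in the traceback below). This is an
# illustration only, not the real pyspark code.
java_error = ("org.apache.spark.sql.AnalysisException: "
              "org.apache.hadoop.hive.ql.metadata.HiveException: "
              "MetaException(message:file:/user/hive/warehouse/t1 "
              "is not a directory or unable to create one);")

# Keep everything after the first ": ", i.e. drop the Java class name.
message = java_error.split(': ', 1)[1]
print(message)
```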
On Wed, Jun 3, 2020 at 1:04 PM Xiao Li <gatorsm...@gmail.com> wrote:

> Just downloaded it on my local MacBook. Trying to create a table using the
> pre-built PySpark. It looks like the conf "spark.sql.warehouse.dir" does
> not take effect. It is trying to create a directory in
> "file:/user/hive/warehouse/t1". I have not done any investigation yet.
> Have any of you hit the same issue?
>
> C02XT0U7JGH5:bin lixiao$ ./pyspark --conf
> spark.sql.warehouse.dir="/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6"
> Python 2.7.16 (default, Jan 27 2020, 04:46:15)
> [GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> 20/06/03 09:56:11 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
> setLogLevel(newLevel).
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/ '_/
>    /__ / .__/\_,_/_/ /_/\_\   version 2.4.6
>       /_/
>
> Using Python version 2.7.16 (default, Jan 27 2020 04:46:15)
> SparkSession available as 'spark'.
> >>> spark.sql("set spark.sql.warehouse.dir").show(truncate=False)
> +-----------------------+-------------------------------------------------+
> |key                    |value                                            |
> +-----------------------+-------------------------------------------------+
> |spark.sql.warehouse.dir|/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6|
> +-----------------------+-------------------------------------------------+
>
> >>> spark.sql("create table t1 (col1 int)")
> 20/06/03 09:56:29 WARN HiveMetaStore: Location: file:/user/hive/warehouse/t1
> specified for non-external table:t1
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6/python/pyspark/sql/session.py",
> line 767, in sql
>     return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
>   File "/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py",
> line 1257, in __call__
>   File "/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6/python/pyspark/sql/utils.py",
> line 69, in deco
>     raise AnalysisException(s.split(': ', 1)[1], stackTrace)
> pyspark.sql.utils.AnalysisException: u'org.apache.hadoop.hive.ql.metadata.HiveException:
> MetaException(message:file:/user/hive/warehouse/t1 is not a directory or
> unable to create one);'
>
> Dongjoon Hyun <dongjoon.h...@gmail.com> wrote on Wed, Jun 3, 2020 at 9:18 AM:
>
>> +1
>>
>> Bests,
>> Dongjoon
>>
>> On Wed, Jun 3, 2020 at 5:59 AM Tom Graves <tgraves...@yahoo.com.invalid> wrote:
>>
>>> +1
>>>
>>> Tom
>>>
>>> On Sunday, May 31, 2020, 06:47:09 PM CDT, Holden Karau
>>> <hol...@pigscanfly.ca> wrote:
>>>
>>> Please vote on releasing the following candidate as Apache Spark
>>> version 2.4.6.
>>>
>>> The vote is open until June 5th at 9 AM PST and passes if a majority of
>>> +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>
>>> [ ] +1 Release this package as Apache Spark 2.4.6
>>> [ ] -1 Do not release this package because ...
>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>>
>>> There are currently no issues targeting 2.4.6 (try: project = SPARK AND
>>> "Target Version/s" = "2.4.6" AND status in (Open, Reopened, "In Progress"))
>>>
>>> The tag to be voted on is v2.4.6-rc8 (commit
>>> 807e0a484d1de767d1f02bd8a622da6450bdf940):
>>> https://github.com/apache/spark/tree/v2.4.6-rc8
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.4.6-rc8-bin/
>>>
>>> Signatures used for Spark RCs can be found in this file:
>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1349/
>>>
>>> The documentation corresponding to this release can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.4.6-rc8-docs/
>>>
>>> The list of bug fixes going into 2.4.6 can be found at the following URL:
>>> https://issues.apache.org/jira/projects/SPARK/versions/12346781
>>>
>>> This release uses the release script of the tag v2.4.6-rc8.
>>>
>>> FAQ
>>>
>>> =========================
>>> What happened to the other RCs?
>>> =========================
>>>
>>> The parallel Maven build caused some flakiness, so I wasn't comfortable
>>> releasing them. I backported the fix from the 3.0 branch for this release.
>>> I've got a proposed change to the build script so that in the future we
>>> only push tags once the build is a success, but it does not block this
>>> release.
>>>
>>> =========================
>>> How can I help test this release?
>>> =========================
>>>
>>> If you are a Spark user, you can help us test this release by taking
>>> an existing Spark workload and running it on this release candidate,
>>> then reporting any regressions.
>>> If you're working in PySpark, you can set up a virtual env, install
>>> the current RC, and see if anything important breaks; in Java/Scala,
>>> you can add the staging repository to your project's resolvers and test
>>> with the RC (make sure to clean up the artifact cache before/after so
>>> you don't end up building with an out-of-date RC going forward).
>>>
>>> ===========================================
>>> What should happen to JIRA tickets still targeting 2.4.6?
>>> ===========================================
>>>
>>> The current list of open tickets targeted at 2.4.6 can be found at:
>>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>>> Version/s" = 2.4.6
>>>
>>> Committers should look at those and triage. Extremely important bug
>>> fixes, documentation, and API tweaks that impact compatibility should
>>> be worked on immediately. Everything else, please retarget to an
>>> appropriate release.
>>>
>>> ==================
>>> But my bug isn't fixed?
>>> ==================
>>>
>>> In order to make timely releases, we will typically not hold the
>>> release unless the bug in question is a regression from the previous
>>> release. That being said, if there is something that is a regression
>>> and has not been correctly targeted, please ping me or a committer to
>>> help target the issue.
>>>
>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
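For PySpark testers, the virtual-env flow Holden describes might look like the following sketch. The env name and the pyspark tarball filename are assumptions, not taken from this thread; check the v2.4.6-rc8-bin/ directory listing for the actual artifact name.

```shell
# Hypothetical names: "rc-venv" and the tarball filename are assumptions;
# verify them against the RC bin directory before running.
python -m venv rc-venv
source rc-venv/bin/activate
pip install https://dist.apache.org/repos/dist/dev/spark/v2.4.6-rc8-bin/pyspark-2.4.6.tar.gz
python -c "import pyspark; print(pyspark.__version__)"
```

On the JVM side, pointing an extra resolver in your build at the staging repository URL above (https://repository.apache.org/content/repositories/orgapachespark-1349/) serves the same purpose.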