I installed single-node Hadoop on my MacBook Pro, using the official hadoop 3.3.6 release. The OS username is shuai.chen, and yes, the name has a dot in the middle! All of the steps below were performed as the OS user shuai.chen.
core-site.xml is configured as follows; the two hadoop.proxyuser entries are meant to let shuai.chen impersonate any user from any host:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/shuai.chen/dev/hadoop-3.3.6/hdfs/tmp</value>
  </property>
  <property>
    <name>hadoop.proxyuser.shuai.chen.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.shuai.chen.groups</name>
    <value>*</value>
  </property>
</configuration>
```

I installed Hive 3.1.2; connecting with beeline as the hive user works fine, and data can be read and written. hive-site.xml is configured as follows:

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
  </property>
  <property>
    <name>beeline.hs2.connection.user</name>
    <value>hive</value>
  </property>
  <property>
    <name>beeline.hs2.connection.password</name>
    <value>hive</value>
  </property>
  <property>
    <name>beeline.hs2.connection.hosts</name>
    <value>localhost:10000</value>
  </property>
  <property>
    <name>hive.aux.jars.path</name>
    <value>/Users/shuai.chen/dev/apache-hive-3.1.2-bin/auxlib</value>
  </property>
  <property>
    <name>hive.execution.engine</name>
    <value>mr</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.metastore.event.db.notification.api.auth</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.cli.print.header</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.webui.host</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hive.server2.webui.port</name>
    <value>10002</value>
  </property>
</configuration>
```

A query through beeline works:

```
0: jdbc:hive2://localhost:10000> select * from student;
INFO : Compiling command(queryId=shuai.chen_20241025121808_1285a60c-aef9-42da-a839-347e30586aa6): select * from student
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:student.id, type:int, comment:null), FieldSchema(name:student.name, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=shuai.chen_20241025121808_1285a60c-aef9-42da-a839-347e30586aa6); Time taken: 2.692 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=shuai.chen_20241025121808_1285a60c-aef9-42da-a839-347e30586aa6): select * from student
INFO : Completed executing command(queryId=shuai.chen_20241025121808_1285a60c-aef9-42da-a839-347e30586aa6); Time taken: 0.006 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+-------------+---------------+
| student.id  | student.name  |
+-------------+---------------+
| 1           | Jack          |
| 2           | Rose          |
+-------------+---------------+
2 rows selected (3.162 seconds)
```
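(beeline here is just the Hive JDBC driver under the hood, so the same sanity check can be done programmatically. A minimal sketch, assuming hive-jdbc and its transitive dependencies are on the classpath; the class name HiveJdbcCheck is my own:)

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcCheck {
    public static void main(String[] args) throws Exception {
        // Not strictly required on modern JDKs if the driver self-registers,
        // but harmless and explicit.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Same endpoint and credentials as the beeline session above.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://localhost:10000/default", "hive", "hive");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("select * from student")) {
            while (rs.next()) {
                // Columns come back labeled student.id / student.name,
                // so read them by position.
                System.out.println(rs.getInt(1) + "\t" + rs.getString(2));
            }
        }
    }
}
```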
I also installed Spark 3.3.1 with Hadoop 3, and spark-sql can query the Hive table as well:

```
24/10/25 12:18:57 WARN HiveConf: HiveConf of name hive.metastore.local does not exist
24/10/25 12:18:57 WARN HiveConf: HiveConf of name hive.metastore.event.db.notification.api.auth does not exist
Spark master: local[*], Application Id: local-1729829938805
spark-sql (default)> select * from student;
24/10/25 12:19:11 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
id	name
1	Jack
2	Rose
Time taken: 2.274 seconds, Fetched 2 row(s)
```

However, accessing the Hive table through Kyuubi 1.9.2 fails with the following error:

```
bin/beeline -u 'jdbc:hive2://localhost:10009/' -n apache
Connecting to jdbc:hive2://localhost:10009/
2024-10-25 12:28:49.658 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.operation.LaunchEngine: Processing apache's query[3961940e-7d87-46a6-a2c8-edd3677b5d96]: PENDING_STATE -> RUNNING_STATE, statement: LaunchEngine
2024-10-25 12:28:49.661 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.shaded.curator.framework.imps.CuratorFrameworkImpl: Starting
2024-10-25 12:28:49.661 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.shaded.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=60000 watcher=org.apache.kyuubi.shaded.curator.ConnectionState@8b80d8c
2024-10-25 12:28:49.664 INFO KyuubiSessionManager-exec-pool: Thread-63-SendThread(localhost:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
2024-10-25 12:28:49.665 INFO KyuubiSessionManager-exec-pool: Thread-63-SendThread(localhost:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
2024-10-25 12:28:49.667 INFO KyuubiSessionManager-exec-pool: Thread-63-SendThread(localhost:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x10009bf7b340013, negotiated timeout = 40000
2024-10-25 12:28:49.667 INFO KyuubiSessionManager-exec-pool: Thread-63-EventThread org.apache.kyuubi.shaded.curator.framework.state.ConnectionStateManager: State change: CONNECTED
2024-10-25 12:28:49.684 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.engine.ProcBuilder: Logging to /Users/shuai.chen/dev/apache-kyuubi-1.9.2-bin/work/apache/kyuubi-spark-sql-engine.log.5
2024-10-25 12:28:49.685 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.engine.EngineRef: Launching engine:
/Users/shuai.chen/dev/spark-3.3.1-bin-hadoop3/bin/spark-submit \
 --class org.apache.kyuubi.engine.spark.SparkSQLEngine \
 --conf spark.hive.server2.thrift.resultset.default.fetch.size=1000 \
 --conf spark.kyuubi.client.ipAddress=127.0.0.1 \
 --conf spark.kyuubi.client.version=1.9.2 \
 --conf spark.kyuubi.engine.engineLog.path=/Users/shuai.chen/dev/apache-kyuubi-1.9.2-bin/work/apache/kyuubi-spark-sql-engine.log.5 \
 --conf spark.kyuubi.engine.share.level=USER \
 --conf spark.kyuubi.engine.submit.time=1729830529677 \
 --conf spark.kyuubi.engine.type=SPARK_SQL \
 --conf spark.kyuubi.frontend.protocols=THRIFT_BINARY,REST \
 --conf spark.kyuubi.ha.addresses=localhost:2181 \
 --conf spark.kyuubi.ha.engine.ref.id=6498a13e-ca86-4f7b-9515-b9b59d19a6dd \
 --conf spark.kyuubi.ha.namespace=/kyuubi_1.9.2_USER_SPARK_SQL/apache/default \
 --conf spark.kyuubi.server.ipAddress=127.0.0.1 \
 --conf spark.kyuubi.session.connection.url=localhost:10009 \
 --conf spark.kyuubi.session.engine.initialize.timeout=PT3M \
 --conf spark.kyuubi.session.real.user=apache \
 --conf spark.app.name=kyuubi_USER_SPARK_SQL_apache_default_6498a13e-ca86-4f7b-9515-b9b59d19a6dd \
 --conf spark.master=yarn \
 --conf spark.submit.deployMode=cluster \
 --conf spark.yarn.maxAppAttempts=1 \
 --conf spark.yarn.tags=KYUUBI,6498a13e-ca86-4f7b-9515-b9b59d19a6dd \
 --proxy-user apache /Users/shuai.chen/dev/apache-kyuubi-1.9.2-bin/externals/engines/spark/kyuubi-spark-sql-engine_2.12-1.9.2.jar
```
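Note the --proxy-user apache at the end: I connected as apache, while the Kyuubi server process runs as shuai.chen, so Kyuubi submits the engine impersonating apache. As far as I understand, inside SparkSubmit this flag boils down to roughly the following (a simplified sketch of the mechanism, not Spark's actual code):

```java
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public class ProxyUserDoAs {
    public static void main(String[] args) throws Exception {
        // The real (launching) user is whoever runs this process,
        // i.e. shuai.chen in my case.
        UserGroupInformation real = UserGroupInformation.getCurrentUser();
        UserGroupInformation proxy =
            UserGroupInformation.createProxyUser("apache", real);
        proxy.doAs((PrivilegedExceptionAction<Void>) () -> {
            // Hadoop RPCs issued here carry effectiveUser=apache and
            // realUser=shuai.chen; the server side (the YARN RM on
            // submission) must find matching hadoop.proxyuser.shuai.chen.*
            // rules, otherwise it throws an AuthorizationException.
            System.out.println("running as "
                + UserGroupInformation.getCurrentUser().getUserName());
            return null;
        });
    }
}
```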
The engine launch then fails:

```
2024-10-25 12:28:53.735 INFO Curator-Framework-0 org.apache.kyuubi.shaded.curator.framework.imps.CuratorFrameworkImpl: backgroundOperationsLoop exiting
2024-10-25 12:28:53.737 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.shaded.zookeeper.ZooKeeper: Session: 0x10009bf7b340013 closed
2024-10-25 12:28:53.737 INFO KyuubiSessionManager-exec-pool: Thread-63-EventThread org.apache.kyuubi.shaded.zookeeper.ClientCnxn: EventThread shut down for session: 0x10009bf7b340013
2024-10-25 12:28:53.738 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.operation.LaunchEngine: Processing apache's query[3961940e-7d87-46a6-a2c8-edd3677b5d96]: RUNNING_STATE -> ERROR_STATE, time taken: 4.079 seconds
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/shuai.chen/dev/spark-3.3.1-bin-hadoop3/jars/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/shuai.chen/dev/hadoop-3.3.6/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
24/10/25 12:28:51 WARN Utils: Your hostname, shuaichendeMacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 172.31.21.68 instead (on interface en0)
24/10/25 12:28:51 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
24/10/25 12:28:52 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/10/25 12:28:52 INFO DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at localhost/127.0.0.1:8032
Exception in thread "main" org.apache.spark.SparkException: ERROR: org.apache.hadoop.security.authorize.AuthorizationException: User: shuai.chen is not allowed to impersonate apache
	at org.apache.spark.deploy.SparkSubmit.error(SparkSubmit.scala:975)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:174)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
24/10/25 12:28:52 INFO ShutdownHookManager: Shutdown hook called
24/10/25 12:28:52 INFO ShutdownHookManager: Deleting directory /private/var/folders/t9/q0g6dydj28ncjkn18rfx_mhc0000gn/T/spark-8025071a-0f2c-4377-bf68-a7b92aee08b9
Error: org.apache.kyuubi.KyuubiSQLException: org.apache.kyuubi.KyuubiSQLException: Exception in thread "main" org.apache.spark.SparkException: ERROR: org.apache.hadoop.security.authorize.AuthorizationException: User: shuai.chen is not allowed to impersonate apache
	at org.apache.spark.deploy.SparkSubmit.error(SparkSubmit.scala:975)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:174)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
See more: /Users/shuai.chen/dev/apache-kyuubi-1.9.2-bin/work/apache/kyuubi-spark-sql-engine.log.5
	at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
	at org.apache.kyuubi.engine.ProcBuilder.$anonfun$start$1(ProcBuilder.scala:234)
	at java.lang.Thread.run(Thread.java:750)
```
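Everything else in this chain works, so my suspicion falls on the dot in the username: the proxyuser rules are keyed as hadoop.proxyuser.&lt;user&gt;.hosts / .groups, and a parser that assumes usernames never contain '.' will mis-split the property name. A toy illustration of that failure mode (my own sketch, not Hadoop's actual parsing code):

```java
// Toy sketch: proxyuser rules are keyed as hadoop.proxyuser.<user>.hosts,
// so a parser that assumes usernames contain no '.' truncates "shuai.chen"
// to "shuai", and the configured rule never matches the submitting user.
public class DottedUserKeyDemo {
    public static void main(String[] args) {
        String key = "hadoop.proxyuser.shuai.chen.hosts";
        String tail = key.substring("hadoop.proxyuser.".length()); // "shuai.chen.hosts"
        String naiveUser = tail.substring(0, tail.indexOf('.'));   // cut at first dot
        System.out.println(naiveUser); // prints "shuai", not "shuai.chen"
    }
}
```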
I found that the Hadoop community already fixed this problem in version 3.2.0: https://issues.apache.org/jira/browse/HADOOP-15395. Since I am running 3.3.6, that fix should already be in place, so is there something I have not configured correctly?

Attached: kyuubi-defaults.conf

```
kyuubi.authentication NONE
kyuubi.frontend.bind.host localhost
kyuubi.frontend.protocols THRIFT_BINARY,REST
kyuubi.frontend.thrift.binary.bind.port 10009
kyuubi.frontend.rest.bind.port 10099
kyuubi.engine.type SPARK_SQL
#kyuubi.engine.type=FLINK_SQL
#kyuubi.engine.type=TRINO
#kyuubi.session.engine.trino.connection.url=http://localhost:18080
#kyuubi.session.engine.trino.connection.catalog=hive
kyuubi.engine.share.level USER
kyuubi.session.engine.initialize.timeout PT3M
#kyuubi.ha.addresses localhost:2181
#kyuubi.ha.namespace kyuubi
spark.master=yarn
spark.submit.deployMode=cluster
# Details in https://kyuubi.readthedocs.io/en/master/configuration/settings.html
```

and kyuubi-env.sh

```
export JAVA_HOME=/Users/shuai.chen/.sdkman/candidates/java/8.0.422-zulu
export SPARK_HOME=/Users/shuai.chen/dev/spark-3.3.1-bin-hadoop3
export FLINK_HOME=/Users/shuai.chen/dev/flink
export FLINK_ENGINE_HOME=/Users/shuai.chen/dev/flink
export TRINO_HOME=/Users/shuai.chen/dev/trino-server-427
export TRINO_ENGINE_HOME=/Users/shuai.chen/dev/trino-server-427
export HADOOP_HOME=/Users/shuai.chen/dev/hadoop-3.3.6
export HADOOP_CONF_DIR=/Users/shuai.chen/dev/hadoop-3.3.6/etc/hadoop
export SPARK_DIST_CLASSPATH=$(/Users/shuai.chen/dev/hadoop-3.3.6/bin/hadoop classpath)
```
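For completeness, here is a minimal standalone check I intend to use to take Kyuubi and Spark out of the picture; if I understand the failure correctly, it exercises the same ProxyUsers authorization the ResourceManager performs (a sketch, assuming hadoop-common 3.3.6 and core-site.xml on the classpath; the class name ImpersonationCheck is my own):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.ProxyUsers;

public class ImpersonationCheck {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml (and the hadoop.proxyuser.* rules) from
        // the classpath; set the two properties explicitly to be safe.
        Configuration conf = new Configuration();
        conf.set("hadoop.proxyuser.shuai.chen.hosts", "*");
        conf.set("hadoop.proxyuser.shuai.chen.groups", "*");
        ProxyUsers.refreshSuperUserGroupsConfiguration(conf);

        UserGroupInformation real =
            UserGroupInformation.createRemoteUser("shuai.chen");
        UserGroupInformation proxy =
            UserGroupInformation.createProxyUser("apache", real);

        // Throws AuthorizationException ("User: shuai.chen is not allowed
        // to impersonate apache") when the proxyuser rules do not match.
        ProxyUsers.authorize(proxy, "127.0.0.1");
        System.out.println("impersonation allowed");
    }
}
```

If this standalone check also rejects shuai.chen, the problem lies in Hadoop's proxyuser handling itself rather than in the Kyuubi or Spark configuration.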