I installed standalone Hadoop on my MacBook Pro, using the official hadoop 3.3.6 release. My OS username is shuai.chen (yes, the name contains a dot!). All of the steps below were performed as the OS user shuai.chen.


core-site.xml is configured as follows:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:8020/</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/Users/shuai.chen/dev/hadoop-3.3.6/hdfs/tmp</value>
    </property>
    <property>
        <name>hadoop.proxyuser.shuai.chen.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.shuai.chen.groups</name>
        <value>*</value>
    </property>
</configuration>
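
For reference, whether these proxy-user rules are parsed correctly can be checked directly against HDFS, without Kyuubi in the picture. A minimal sketch (assuming the command is run from $HADOOP_HOME with HDFS up; HADOOP_PROXY_USER makes the client impersonate the given user, and the NameNode only re-reads proxy-user rules on restart or refresh):

```
# Run an HDFS command as shuai.chen while impersonating "apache".
# If the hadoop.proxyuser.shuai.chen.* rules take effect, this succeeds;
# otherwise it fails with the same AuthorizationException shown below.
HADOOP_PROXY_USER=apache bin/hadoop fs -ls /
```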


Hive 3.1.2 is installed, and access through beeline with the hive account works; data can be read and written.
hive-site.xml is configured as follows:

<?xml version="1.0"?>
<configuration>
    <property>
        <name>hive.metastore.local</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://localhost:9083</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
    </property>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
    </property>
    <property>
        <name>beeline.hs2.connection.user</name>
        <value>hive</value>
    </property>
    <property>
        <name>beeline.hs2.connection.password</name>
        <value>hive</value>
    </property>
    <property>
        <name>beeline.hs2.connection.hosts</name>
        <value>localhost:10000</value>
    </property>
    <property>
        <name>hive.aux.jars.path</name>
        <value>/Users/shuai.chen/dev/apache-hive-3.1.2-bin/auxlib</value>
    </property>
    <property>
        <name>hive.execution.engine</name>
        <value>mr</value>
    </property>
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>localhost</value>
    </property>
    <property>
        <name>hive.server2.thrift.port</name>
        <value>10000</value>
    </property>
    <property>
        <name>hive.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.server2.webui.host</name>
        <value>localhost</value>
    </property>
    <property>
        <name>hive.server2.webui.port</name>
        <value>10002</value>
    </property>
</configuration>
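
For completeness, the beeline session below was opened along these lines (a sketch; with the beeline.hs2.connection.* defaults above, the user and password can also be omitted):

```
# Connect to HiveServer2 on the Thrift port configured above.
bin/beeline -u 'jdbc:hive2://localhost:10000' -n hive -p hive
```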


0: jdbc:hive2://localhost:10000> select * from student;
INFO  : Compiling command(queryId=shuai.chen_20241025121808_1285a60c-aef9-42da-a839-347e30586aa6): select * from student
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:student.id, type:int, comment:null), FieldSchema(name:student.name, type:string, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=shuai.chen_20241025121808_1285a60c-aef9-42da-a839-347e30586aa6); Time taken: 2.692 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing command(queryId=shuai.chen_20241025121808_1285a60c-aef9-42da-a839-347e30586aa6): select * from student
INFO  : Completed executing command(queryId=shuai.chen_20241025121808_1285a60c-aef9-42da-a839-347e30586aa6); Time taken: 0.006 seconds
INFO  : OK
INFO  : Concurrency mode is disabled, not creating a lock manager
+-------------+---------------+
| student.id  | student.name  |
+-------------+---------------+
| 1           | Jack          |
| 2           | Rose          |
+-------------+---------------+
2 rows selected (3.162 seconds)




Spark 3.3.1 (the build with Hadoop 3) is also installed, and Spark SQL can access the Hive tables as well:


24/10/25 12:18:57 WARN HiveConf: HiveConf of name hive.metastore.local does not exist
24/10/25 12:18:57 WARN HiveConf: HiveConf of name hive.metastore.event.db.notification.api.auth does not exist
Spark master: local[*], Application Id: local-1729829938805
spark-sql (default)> select * from student;
24/10/25 12:19:11 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
id	name
1	Jack
2	Rose
Time taken: 2.274 seconds, Fetched 2 row(s)
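
For reference, the Spark SQL CLI above finds the same metastore because it reads hive-site.xml; a sketch of that setup (assuming the file is simply copied into Spark's conf directory):

```
# Point Spark at the Hive metastore, then run the same query from the CLI.
cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf/
$SPARK_HOME/bin/spark-sql -e 'select * from student;'
```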




Next, when accessing the Hive tables through Kyuubi 1.9.2, the following error is raised:
bin/beeline -u 'jdbc:hive2://localhost:10009/' -n apache


Connecting to jdbc:hive2://localhost:10009/
2024-10-25 12:28:49.658 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.operation.LaunchEngine: Processing apache's query[3961940e-7d87-46a6-a2c8-edd3677b5d96]: PENDING_STATE -> RUNNING_STATE, statement:
LaunchEngine
2024-10-25 12:28:49.661 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.shaded.curator.framework.imps.CuratorFrameworkImpl: Starting
2024-10-25 12:28:49.661 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.shaded.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=60000 watcher=org.apache.kyuubi.shaded.curator.ConnectionState@8b80d8c
2024-10-25 12:28:49.664 INFO KyuubiSessionManager-exec-pool: Thread-63-SendThread(localhost:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error)
2024-10-25 12:28:49.665 INFO KyuubiSessionManager-exec-pool: Thread-63-SendThread(localhost:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session
2024-10-25 12:28:49.667 INFO KyuubiSessionManager-exec-pool: Thread-63-SendThread(localhost:2181) org.apache.kyuubi.shaded.zookeeper.ClientCnxn: Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x10009bf7b340013, negotiated timeout = 40000
2024-10-25 12:28:49.667 INFO KyuubiSessionManager-exec-pool: Thread-63-EventThread org.apache.kyuubi.shaded.curator.framework.state.ConnectionStateManager: State change: CONNECTED
2024-10-25 12:28:49.684 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.engine.ProcBuilder: Logging to /Users/shuai.chen/dev/apache-kyuubi-1.9.2-bin/work/apache/kyuubi-spark-sql-engine.log.5
2024-10-25 12:28:49.685 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.engine.EngineRef: Launching engine:
/Users/shuai.chen/dev/spark-3.3.1-bin-hadoop3/bin/spark-submit \
  --class org.apache.kyuubi.engine.spark.SparkSQLEngine \
  --conf spark.hive.server2.thrift.resultset.default.fetch.size=1000 \
  --conf spark.kyuubi.client.ipAddress=127.0.0.1 \
  --conf spark.kyuubi.client.version=1.9.2 \
  --conf spark.kyuubi.engine.engineLog.path=/Users/shuai.chen/dev/apache-kyuubi-1.9.2-bin/work/apache/kyuubi-spark-sql-engine.log.5 \
  --conf spark.kyuubi.engine.share.level=USER \
  --conf spark.kyuubi.engine.submit.time=1729830529677 \
  --conf spark.kyuubi.engine.type=SPARK_SQL \
  --conf spark.kyuubi.frontend.protocols=THRIFT_BINARY,REST \
  --conf spark.kyuubi.ha.addresses=localhost:2181 \
  --conf spark.kyuubi.ha.engine.ref.id=6498a13e-ca86-4f7b-9515-b9b59d19a6dd \
  --conf spark.kyuubi.ha.namespace=/kyuubi_1.9.2_USER_SPARK_SQL/apache/default \
  --conf spark.kyuubi.server.ipAddress=127.0.0.1 \
  --conf spark.kyuubi.session.connection.url=localhost:10009 \
  --conf spark.kyuubi.session.engine.initialize.timeout=PT3M \
  --conf spark.kyuubi.session.real.user=apache \
  --conf spark.app.name=kyuubi_USER_SPARK_SQL_apache_default_6498a13e-ca86-4f7b-9515-b9b59d19a6dd \
  --conf spark.master=yarn \
  --conf spark.submit.deployMode=cluster \
  --conf spark.yarn.maxAppAttempts=1 \
  --conf spark.yarn.tags=KYUUBI,6498a13e-ca86-4f7b-9515-b9b59d19a6dd \
  --proxy-user apache /Users/shuai.chen/dev/apache-kyuubi-1.9.2-bin/externals/engines/spark/kyuubi-spark-sql-engine_2.12-1.9.2.jar
2024-10-25 12:28:53.735 INFO Curator-Framework-0 org.apache.kyuubi.shaded.curator.framework.imps.CuratorFrameworkImpl: backgroundOperationsLoop exiting
2024-10-25 12:28:53.737 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.shaded.zookeeper.ZooKeeper: Session: 0x10009bf7b340013 closed
2024-10-25 12:28:53.737 INFO KyuubiSessionManager-exec-pool: Thread-63-EventThread org.apache.kyuubi.shaded.zookeeper.ClientCnxn: EventThread shut down for session: 0x10009bf7b340013
2024-10-25 12:28:53.738 INFO KyuubiSessionManager-exec-pool: Thread-63 org.apache.kyuubi.operation.LaunchEngine: Processing apache's query[3961940e-7d87-46a6-a2c8-edd3677b5d96]: RUNNING_STATE -> ERROR_STATE, time taken: 4.079 seconds
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/shuai.chen/dev/spark-3.3.1-bin-hadoop3/jars/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/shuai.chen/dev/hadoop-3.3.6/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
24/10/25 12:28:51 WARN Utils: Your hostname, shuaichendeMacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 172.31.21.68 instead (on interface en0)
24/10/25 12:28:51 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
24/10/25 12:28:52 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/10/25 12:28:52 INFO DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at localhost/127.0.0.1:8032
Exception in thread "main" org.apache.spark.SparkException: ERROR: org.apache.hadoop.security.authorize.AuthorizationException: User: shuai.chen is not allowed to impersonate apache
	at org.apache.spark.deploy.SparkSubmit.error(SparkSubmit.scala:975)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:174)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
24/10/25 12:28:52 INFO ShutdownHookManager: Shutdown hook called
24/10/25 12:28:52 INFO ShutdownHookManager: Deleting directory /private/var/folders/t9/q0g6dydj28ncjkn18rfx_mhc0000gn/T/spark-8025071a-0f2c-4377-bf68-a7b92aee08b9
Error: org.apache.kyuubi.KyuubiSQLException: org.apache.kyuubi.KyuubiSQLException: Exception in thread "main" org.apache.spark.SparkException: ERROR: org.apache.hadoop.security.authorize.AuthorizationException: User: shuai.chen is not allowed to impersonate apache
	at org.apache.spark.deploy.SparkSubmit.error(SparkSubmit.scala:975)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:174)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
See more: /Users/shuai.chen/dev/apache-kyuubi-1.9.2-bin/work/apache/kyuubi-spark-sql-engine.log.5
	at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
	at org.apache.kyuubi.engine.ProcBuilder.$anonfun$start$1(ProcBuilder.scala:234)
	at java.lang.Thread.run(Thread.java:750)




I found that the Hadoop community already fixed this problem (proxy-user configuration with a dot in the username) in release 3.2.0:
https://issues.apache.org/jira/browse/HADOOP-15395
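
One detail worth noting: with spark.master=yarn, the impersonation check is performed by the ResourceManager, and proxy-user rules are only re-read when the daemons are restarted or explicitly refreshed. A sketch of the refresh (assuming default ports, run from $HADOOP_HOME):

```
# Make the running NameNode and ResourceManager re-read hadoop.proxyuser.*.
bin/hdfs dfsadmin -refreshSuperUserGroupsConfiguration
bin/yarn rmadmin -refreshSuperUserGroupsConfiguration
```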


Is there something I have misconfigured?


Attached: kyuubi-defaults.conf
```
kyuubi.authentication                    NONE

kyuubi.frontend.bind.host                localhost
kyuubi.frontend.protocols                THRIFT_BINARY,REST
kyuubi.frontend.thrift.binary.bind.port  10009
kyuubi.frontend.rest.bind.port           10099

kyuubi.engine.type                       SPARK_SQL
#kyuubi.engine.type=FLINK_SQL
#kyuubi.engine.type=TRINO
#kyuubi.session.engine.trino.connection.url=http://localhost:18080
#kyuubi.session.engine.trino.connection.catalog=hive

kyuubi.engine.share.level                USER
kyuubi.session.engine.initialize.timeout PT3M

#kyuubi.ha.addresses                     localhost:2181
#kyuubi.ha.namespace                     kyuubi

spark.master=yarn
spark.submit.deployMode=cluster

# Details in https://kyuubi.readthedocs.io/en/master/configuration/settings.html
```
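
(For illustration only, not a fix for the proxy-user parsing itself: pointing the engine at a local master instead of YARN, as sketched below, would bypass the ResourceManager's impersonation check.)

```
# Hypothetical alternative: run the Kyuubi Spark engine without YARN.
spark.master=local[*]
#spark.submit.deployMode=cluster   # cluster deploy mode is invalid with a local master
```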


And kyuubi-env.sh:


```
export JAVA_HOME=/Users/shuai.chen/.sdkman/candidates/java/8.0.422-zulu
export SPARK_HOME=/Users/shuai.chen/dev/spark-3.3.1-bin-hadoop3
export FLINK_HOME=/Users/shuai.chen/dev/flink
export FLINK_ENGINE_HOME=/Users/shuai.chen/dev/flink
export TRINO_HOME=/Users/shuai.chen/dev/trino-server-427
export TRINO_ENGINE_HOME=/Users/shuai.chen/dev/trino-server-427
export HADOOP_HOME=/Users/shuai.chen/dev/hadoop-3.3.6
export HADOOP_CONF_DIR=/Users/shuai.chen/dev/hadoop-3.3.6/etc/hadoop
export SPARK_DIST_CLASSPATH=$(/Users/shuai.chen/dev/hadoop-3.3.6/bin/hadoop classpath)
```
