GeorgeJahad opened a new pull request, #3915:
URL: https://github.com/apache/ozone/pull/3915
## What changes were proposed in this pull request?
Hadoop distributes two versions of its HDFS client:
hadoop-client-api-3.3.1.jar and hadoop-common-3.3.4.jar.
The first uses shaded protobufs, e.g.:
org.apache.hadoop.shaded.com.google.protobuf.Message
The second uses unshaded protobufs, e.g.: com.google.protobuf.Message
Currently, Ozone only supports unshaded protobufs (via
ozone-filesystem-hadoop3-1.3.0-SNAPSHOT.jar), but projects like Spark use
shaded protobufs (through hadoop-client-api-3.3.1.jar).
This PR adds ozone-filesystem-hadoop-client-1.3.0-SNAPSHOT.jar, which is
identical to ozone-filesystem-hadoop3-1.3.0-SNAPSHOT.jar except that it uses
the shaded protobufs, so that it works with Spark and other systems
distributed with the hadoop-client jars.
## What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-6926
## How was this patch tested?
I installed Spark and confirmed reading Ozone keys with the new jar, using
the instructions below. Note that following the same instructions with the
unshaded jar, ozone-filesystem-hadoop3-1.3.0-SNAPSHOT.jar, causes the cast
exception reported in the JIRA ticket.
```
# install spark
# <Download https://archive.apache.org/dist/spark/spark-3.2.1/spark-3.2.1-bin-hadoop3.2.tgz>
cd $OZONE_ROOT/hadoop-ozone/dist/target/ozone-1.3.0-SNAPSHOT
mkdir spark
cd spark
tar -xzf ~/Downloads/spark-3.2.1-bin-hadoop3.2.tgz
# copy over the shaded jar file
cp \
  $OZONE_ROOT/hadoop-ozone/ozonefs-hadoop-client/target/ozone-filesystem-hadoop-client-1.3.0-SNAPSHOT.jar \
  $OZONE_ROOT/hadoop-ozone/dist/target/ozone-1.3.0-SNAPSHOT/spark/spark-3.2.1-bin-hadoop3.2/jars
# start up docker cluster
cd $OZONE_ROOT/hadoop-ozone/dist/target/ozone-1.3.0-SNAPSHOT/compose/ozone
docker-compose up --no-recreate --scale datanode=3 -d
# init docker cluster
docker exec -it ozone_om_1 bash
cd /opt/hadoop/spark/spark-3.2.1-bin-hadoop3.2/conf
cp /etc/hadoop/ozone-site.xml .
cd /opt/hadoop/spark/spark-3.2.1-bin-hadoop3.2/bin
# init vol/bucket/key
ozone sh volume create testgbj2
ozone sh bucket create testgbj2/bucket1
echo k1 > k1.orig
ozone sh key put testgbj2/bucket1/k1 k1.orig
# read ozone from spark
./spark-shell
sc.setLogLevel("DEBUG")
spark.read.text("ofs://om/testgbj2/bucket1/k1").show()
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]