GeorgeJahad opened a new pull request, #3915:
URL: https://github.com/apache/ozone/pull/3915

    ## What changes were proposed in this pull request?
   
   Hadoop distributes two versions of its HDFS client:
   
   `hadoop-client-api-3.3.1.jar` and `hadoop-common-3.3.4.jar`
   
   The first uses shaded protobufs, e.g. `org.apache.hadoop.shaded.com.google.protobuf.Message`.
   
   The second uses unshaded protobufs, e.g. `com.google.protobuf.Message`.
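   Why this matters: to the JVM, the shaded and unshaded `Message` classes are unrelated types even though they come from the same source, so passing one where the other is expected throws a `ClassCastException`. A minimal sketch of that mechanism, using hypothetical stand-in nested classes (`ShadingDemo`, `Unshaded`, `Shaded` are illustrative names, not real Hadoop classes):

```java
// Illustrative sketch only: Unshaded.Message and Shaded.Message stand in for
// com.google.protobuf.Message and
// org.apache.hadoop.shaded.com.google.protobuf.Message respectively.
public class ShadingDemo {
    static class Unshaded { static class Message {} }
    static class Shaded { static class Message {} }

    public static void main(String[] args) {
        // A "shaded" Message arrives where an "unshaded" one is expected...
        Object msg = new Shaded.Message();
        try {
            // ...and the cast fails, because the JVM treats the two Message
            // classes as unrelated types despite their identical source.
            Unshaded.Message m = (Unshaded.Message) msg;
            System.out.println("cast succeeded: " + m);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException between shaded and unshaded Message");
        }
    }
}
```

   This is the same shape of failure Spark hits when Ozone's filesystem jar hands it unshaded protobuf objects while Spark's classpath (via `hadoop-client-api`) expects shaded ones.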
   
   
   Currently, Ozone only supports unshaded protobufs (via `ozone-filesystem-hadoop3-1.3.0-SNAPSHOT.jar`), but projects like Spark use shaded protobufs (through `hadoop-client-api-3.3.1.jar`).
   
   This PR adds `ozone-filesystem-hadoop-client-1.3.0-SNAPSHOT.jar`, which is identical to `ozone-filesystem-hadoop3-1.3.0-SNAPSHOT.jar` except that it uses the shaded protobufs, so that it works with Spark and other systems distributed with the hadoop-client jars.
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-6926
   
   ## How was this patch tested?
   
   
   I installed Spark and confirmed reading Ozone keys with the new jar, using the instructions below. Note that following the same instructions with the unshaded jar file, `ozone-filesystem-hadoop3-1.3.0-SNAPSHOT.jar`, will cause the cast exception reported in the JIRA ticket.
   
   ```
   # install spark
   <Download 
https://archive.apache.org/dist/spark/spark-3.2.1/spark-3.2.1-bin-hadoop3.2.tgz>
   cd $OZONE_ROOT/hadoop-ozone/dist/target/ozone-1.3.0-SNAPSHOT
   mkdir spark
   cd spark
   tar -xzf ~/Downloads/spark-3.2.1-bin-hadoop3.2.tgz
   
   # copy over the shaded jar file
   cp $OZONE_ROOT/hadoop-ozone/ozonefs-hadoop-client/target/ozone-filesystem-hadoop-client-1.3.0-SNAPSHOT.jar \
      $OZONE_ROOT/hadoop-ozone/dist/target/ozone-1.3.0-SNAPSHOT/spark/spark-3.2.1-bin-hadoop3.2/jars
   
   
   # start up docker cluster
   cd $OZONE_ROOT/hadoop-ozone/dist/target/ozone-1.3.0-SNAPSHOT/compose/ozone
   docker-compose up --no-recreate --scale datanode=3 -d
   
   # init docker cluster
   docker exec -it ozone_om_1 bash
   cd /opt/hadoop/spark/spark-3.2.1-bin-hadoop3.2/conf
   cp /etc/hadoop/ozone-site.xml .
   cd /opt/hadoop/spark/spark-3.2.1-bin-hadoop3.2/bin
   
   
   # init vol/bucket/key
   ozone sh volume create testgbj2
   ozone sh bucket create testgbj2/bucket1
   echo k1 > k1.orig
   ozone sh key put testgbj2/bucket1/k1 k1.orig
   
   # read ozone from spark
   ./spark-shell
   sc.setLogLevel("DEBUG")
   spark.read.text("ofs://om/testgbj2/bucket1/k1").show()
   
   ```
   
   

