Jia, thank you for your helpful reply.

Regarding the statement “In short, I copy all jars into /usr/lib/spark/jars”: 
could you please let me know where exactly this directory should be? We’ve 
been trying to place these jars in an S3 bucket and then copy them over to the 
file system of one of the EMR nodes on launch. We think this is what is 
causing the error.

Many thanks for your help,

Mehmet

From: Jia Yu <ji...@apache.org>
Sent: 29 July 2022 04:03
To: dev@sedona.apache.org; Mehmet Kalich <m.kal...@addland.com>
Subject: Re: Sedona on AWS EMR Cluster

OK. I saw the image now.

Here is a user comment from Sedona Gitter


Hi. I use Sedona on EMR 6.3 without issue. In short, I copy all jars into 
/usr/lib/spark/jars. The jars include spark_spark-avro_2.12.2.4.4.jar, 
sedona-python-adapter-3.0_2.12-1.0.1-incubating.jar, sedona-core, sedona-sql, 
and geotools-wrapper. I also set up a Python virtualenv and pip install all 
dependencies there on all nodes. The last part is to set an EMR Configuration 
on the core instance group: Classification: spark-env.export, PYSPARK_PYTHON, 
/home/hadoop/venv/bin/python. That ensures that your spark-submitted jobs use 
the virtualenv you've created (named venv in this case).
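
As a rough illustration of the EMR Configuration described above, here is a minimal Python sketch that builds the JSON. It assumes that "spark-env.export" refers to the nested "spark-env" / "export" classification that EMR configurations use; the helper name build_spark_env_config is hypothetical, and the Python path is the one quoted in the comment.

```python
import json

# Path quoted in the comment above; adjust it to wherever your virtualenv lives.
VENV_PYTHON = "/home/hadoop/venv/bin/python"


def build_spark_env_config(python_path):
    """Return an EMR configuration list that exports PYSPARK_PYTHON.

    Assumption: "spark-env.export" means an "export" classification
    nested inside a "spark-env" classification.
    """
    return [
        {
            "Classification": "spark-env",
            "Configurations": [
                {
                    "Classification": "export",
                    "Properties": {"PYSPARK_PYTHON": python_path},
                }
            ],
        }
    ]


emr_config = build_spark_env_config(VENV_PYTHON)

# Print the JSON so it can be saved to a file or pasted into the EMR console.
print(json.dumps(emr_config, indent=2))
```

The printed JSON could be saved to a file and supplied when the cluster is created, for example through the --configurations option of aws emr create-cluster. The comment applies it to the core instance group, so where exactly it should be attached is left to your setup.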

Sedona is configured as a sql extension, so to use it in your spark-submitted 
app, include --conf 
spark.sql.extensions="org.apache.sedona.sql.SedonaSqlExtensions". I don't think 
I did anything else to make it available to submitted apps or to Zeppelin 
notebooks. It just works.
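
A small Python sketch of how the spark-submit invocation described above might be assembled. The script name my_sedona_job.py and the helper build_spark_submit are hypothetical; only the --conf key and value come from the comment.

```python
import shlex

# Extension class named in the comment above.
SEDONA_EXTENSION = "org.apache.sedona.sql.SedonaSqlExtensions"


def build_spark_submit(app_path, extra_conf=None):
    """Build a spark-submit argument list that enables the Sedona SQL extension."""
    conf = {"spark.sql.extensions": SEDONA_EXTENSION}
    # Optional additional --conf pairs supplied by the caller.
    conf.update(extra_conf or {})
    cmd = ["spark-submit"]
    for key, value in conf.items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app_path)
    return cmd


# "my_sedona_job.py" is a placeholder for your own application.
command = build_spark_submit("my_sedona_job.py")

# Print a shell-safe version of the command for inspection.
print(shlex.join(command))
```

Keeping the command as a list means it can be passed directly to subprocess.run on the master node without extra shell quoting.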


One more thing. I'm still using Sedona 1.0.1. To use shapefiles I had to keep 
them zipped to load them. The zip file included the .shp, .shx, .dbf, files.

On Thu, Jul 28, 2022 at 7:50 PM Jia Yu <ji...@apache.org> wrote:
Hi Mehmet,

The figure in your email is not visible. Can you copy it as text? Many Sedona 
users are using EMR. Sedona should work fine there.

Thanks,
Jia

On Tue, Jul 26, 2022 at 8:59 AM Mehmet Kalich <m.kal...@addland.com> wrote:
Dear Sedona team,

I work at a Geospatial research company in London, and we are trying to install 
Sedona on an AWS EMR Cluster.

The main issue is that when we add the jars in the EMR bootstrap steps, we get 
this error:

As a result, the JAR files cannot be opened.

If you could please write back with a link to articles or support on using 
Sedona in EMR, that would be greatly appreciated.

Best wishes,

Mehmet Kalich

Platform Engineer
Addland
