Thank you for the quick reply!

It seems my particular situation is a bit more complex than that, since I'm
running the notebook on a Databricks cluster. The default Spark config
doesn't seem to allow for additional jar repositories (GeoTools isn't on
Maven Central), nor does creating a new SparkSession appear to work. I've
also tried downloading the jars and adding them manually to the cluster, but
that doesn't seem to work either. But at least I know where the issue is!
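
For reference, here is a minimal sketch of how the download URL of each jar
can be derived from its Maven coordinate, using the standard Maven repository
layout (group path / artifact / version / artifact-version.jar). The helper
name maven_jar_url is hypothetical; the repository URL and coordinates are the
ones quoted below. Note that a manual download does not pull in transitive
dependencies. My assumption, which is worth verifying, is that the missing
org.opengis classes ship in a separate GeoTools artifact that gt-main depends
on, which would explain why adding only the listed jars is not enough.

```python
def maven_jar_url(repository: str, coordinate: str) -> str:
    """Build a jar URL from a 'group:artifact:version' Maven coordinate."""
    group_id, artifact_id, version = coordinate.split(":")
    # Maven repositories store artifacts under the group id as a directory path.
    group_path = group_id.replace(".", "/")
    return (
        f"{repository.rstrip('/')}/{group_path}/{artifact_id}/{version}/"
        f"{artifact_id}-{version}.jar"
    )


# Repository and coordinates taken from the quoted reply below.
OSGEO_RELEASES = "https://repo.osgeo.org/repository/release"
GEOTOOLS_COORDINATES = [
    "org.geotools:gt-main:24.0",
    "org.geotools:gt-referencing:24.0",
    "org.geotools:gt-epsg-hsql:24.0",
]

JAR_URLS = [maven_jar_url(OSGEO_RELEASES, c) for c in GEOTOOLS_COORDINATES]

for url in JAR_URLS:
    print(url)
```

The printed URLs only cover the artifacts named explicitly; any dependencies
they declare in their POM files would have to be fetched the same way.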

Thanks again for your help,
Regards

On Wed, 10 Feb 2021 at 12:22, Jia Yu <ji...@apache.org> wrote:

> Hi Gregory,
>
> Thanks for letting us know. This is not a bug. We cannot include GeoTools
> jars due to license issues. But indeed we forgot to update the docs and
> jupyter notebook examples. I just updated them. Please read them here:
>
>
> https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaSQL.ipynb
>
> (Make sure you disable the browser cache or open it in an incognito
> window)  http://sedona.apache.org/download/overview/#install-sedona-python
>
> In short, you need to add the following coordinates in the notebook:
>
> from pyspark.sql import SparkSession
> from sedona.utils import KryoSerializer, SedonaKryoRegistrator
>
> spark = SparkSession.builder. \
>     appName('appName'). \
>     config("spark.serializer", KryoSerializer.getName). \
>     config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
>     config("spark.jars.repositories",
>            'https://repo.osgeo.org/repository/release,'
>            'https://download.java.net/maven/2'). \
>     config('spark.jars.packages',
>            'org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating,'
>            'org.geotools:gt-main:24.0,'
>            'org.geotools:gt-referencing:24.0,'
>            'org.geotools:gt-epsg-hsql:24.0'). \
>     getOrCreate()
>
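
Both spark.jars.repositories and spark.jars.packages take a single
comma-separated string, so a stray space or line break between entries is an
easy mistake to make when the value is built from adjacent string literals.
As a rough sketch, the values from the snippet above could be assembled from
plain lists instead. The helper join_entries and the SPARK_CONF dictionary
are hypothetical names used only for illustration; they are not part of
Sedona or Spark.

```python
def join_entries(entries):
    """Join coordinates or URLs with commas, dropping whitespace and blanks."""
    cleaned = [entry.strip() for entry in entries if entry and entry.strip()]
    return ",".join(cleaned)


# Values copied from the configuration quoted above.
REPOSITORIES = join_entries([
    "https://repo.osgeo.org/repository/release",
    "https://download.java.net/maven/2",
])
PACKAGES = join_entries([
    "org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating",
    "org.geotools:gt-main:24.0",
    "org.geotools:gt-referencing:24.0",
    "org.geotools:gt-epsg-hsql:24.0",
])

# Hypothetical container; each pair can be passed to builder.config(key, value).
SPARK_CONF = {
    "spark.jars.repositories": REPOSITORIES,
    "spark.jars.packages": PACKAGES,
}

for key, value in SPARK_CONF.items():
    print(f"{key}={value}")
```

Each key/value pair can then be handed to the session builder's config()
call, or entered wherever the cluster accepts Spark properties.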
> On Wed, Feb 10, 2021 at 2:35 AM Grégory Dugernier <g...@aloalto.com> wrote:
>
>> Hello,
>>
>> I've been trying to run Sedona for Python on Databricks for 2 days and I
>> think I've stumbled upon a bug.
>>
>> *Configuration*:
>>
>>    - Spark 3.0.1
>>    - Scala 2.12
>>    - Python 3.7
>>
>> *Libraries*:
>>
>>    - apache-sedona (from PyPi)
>>    - org.apache.sedona:sedona-python-adapter-3.0_2.12:1.0.0-incubating
>>    (from Maven)
>>
>> *What I'm trying to do:*
>>
>> I'm trying to load a series of Shapefiles into a dataframe for
>> geospatial analysis. See the code snippet below, based on your example
>> notebook
>> <https://github.com/apache/incubator-sedona/blob/master/python/ApacheSedonaCore.ipynb>
>>
>>
>> > from sedona.core.formatMapper.shapefileParser import ShapefileReader
>> > from sedona.register import SedonaRegistrator
>> > from sedona.utils.adapter import Adapter
>> >
>> > SedonaRegistrator.registerAll(spark)
>> > shape_rdd = ShapefileReader.readToGeometryRDD(spark.sparkContext,
>> > file_name)
>> > df = Adapter.toDf(shape_rdd, spark)
>> >
>>
>> *Bug*:
>>
>> The ShapefileReader.readToGeometryRDD() currently throws the following
>> error:
>>
>> > Py4JJavaError: An error occurred while calling
>> > z:org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD.
>> > : java.lang.NoClassDefFoundError: org/opengis/referencing/FactoryException
>> >   at org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:79)
>> >   at org.apache.sedona.core.formatMapper.shapefileParser.ShapefileReader.readToGeometryRDD(ShapefileReader.java:66)
>> >   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> >   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >   at java.lang.reflect.Method.invoke(Method.java:498)
>> >   at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
>> >   at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
>> >   at py4j.Gateway.invoke(Gateway.java:295)
>> >   at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>> >   at py4j.commands.CallCommand.execute(CallCommand.java:79)
>> >   at py4j.GatewayConnection.run(GatewayConnection.java:251)
>> >   at java.lang.Thread.run(Thread.java:748)
>> > Caused by: java.lang.ClassNotFoundException: org.opengis.referencing.FactoryException
>> >   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
>> >   at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
>> >   at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
>>
>>
>> Adding the org.apache.sedona:sedona-core-3.0_2.12:1.0.0-incubating library
>> from Maven doesn't solve the error. Adding the
>> org.datasyslab:geospark:1.3.1
>> library from Maven solves the error, but it creates conflicts with the
>> underlying org.locationtech.jts dependencies. This makes me think there is
>> a missing OpenGIS dependency in the sedona-python-adapter.
>>
>> Regards,
>> G. Dugernier
>>
>> --
>>
>>
>>
>> Grégory Dugernier
>> Software Engineer
>>
>> g...@aloalto.com <f...@aloalto.com>
>> +32 (0)484 11 26 09
>>
>> www.aloalto.com
>> +32 (0)2 736 10 17
>>
>> --
>>
>>
>>
>>
>> DISCLAIMER : The content of this e-mail
>> message does not constitute a
>> commitment of S.A. ALOALTO N.V. or its
>> subsidiaries/affiliates. This e-mail
>> and any attachments thereto may contain
>> information which is confidential
>> and/or protected by intellectual property
>> rights and are intended for the
>> intended recipient only. Any use of the
>> information contained herein
>> (including, but not limited to, total or partial
>> reproduction,
>> communication or distribution in any form) by persons other than
>> the
>> designated recipient(s) is prohibited. If an addressing or transmission
>> error has misdirected this e-mail, please notify the author, either by
>> telephone or by e-mail and delete the material from any computer.
>>
>>

-- 



Grégory Dugernier
Software Engineer

g...@aloalto.com <f...@aloalto.com>
+32 (0)484 11 26 09

www.aloalto.com
+32 (0)2 736 10 17

