GEOSPARK ERROR PYTHON

2021-02-12 Thread Ramon Barros
I'm using Python 3.6 and Spark 2.4, and I added the correct jars to Spark.
Some functions execute correctly, while others raise the error shown in the
image below. Can anyone help?

ERROR EXECUTING POINTRDD:

The error occurs when I execute the PointRDD constructor with these two EPSG
parameters:

point_rdd = PointRDD(sc, input_location, splitter, carry_other_attributes,
level, s_epsg, t_epsg)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/anaconda3/lib/python3.6/site-packages/geospark/utils/meta.py", line 122, in __call__
    return method(*args, **kwargs)
  File "/opt/anaconda3/lib/python3.6/site-packages/geospark/core/SpatialRDD/point_rdd.py", line 381, in __init__
    targetEpsgCode
  File "/opt/anaconda3/lib/python3.6/site-packages/py4j/java_gateway.py", line 1569, in __call__
    answer, self._gateway_client, None, self._fqn)
  File "/usr/local/spark/python/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/opt/anaconda3/lib/python3.6/site-packages/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.datasyslab.geospark.spatialRDD.PointRDD.
: java.net.ConnectException: Call From hadoopMaster/192.168.0.103 to hadoopMaster:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
	at org.apache.hadoop.ipc.Client.call(Client.java:1479)
	at org.apache.hadoop.ipc.Client.call(Client.java:1412)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
	at com.sun.proxy.$Proxy26.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at com.sun.proxy.$Proxy27.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
	at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
	at org.apache.hadoop.fs.Globber.glob(Globber.java:252)
	at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1676)
	at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:259)
	at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
	at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
	at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:204)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.s
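
The root of this trace is the Hadoop call rather than GeoSpark: the PointRDD
constructor lists input_location through Hadoop's FileInputFormat, and the
namenode at hadoopMaster:9000 refuses the connection. A minimal diagnostic
sketch, assuming the same live SparkContext sc used above; if plain Spark fails
on the same path, the problem is HDFS and not the PointRDD call:

# Diagnostic sketch (assumes the thread's SparkContext `sc`).
# A bare "checkin.csv" resolves against fs.defaultFS, which points at
# hdfs://hadoopMaster:9000 here, so this raises the same ConnectException
# while the namenode is unreachable.
probe = sc.textFile("checkin.csv")
print(probe.take(1))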

ramonbarrosk @ramonbarrosk Feb 11 11:15
The instructions below run normally; it is the PointRDD call with the EPSG
parameters above that raises the error:
from pyspark import StorageLevel
from geospark.core.SpatialRDD import PointRDD
from geospark.core.enums import FileDataSplitter

input_location = "checkin.csv"
offset = 0 # The point long/lat starts from Column 0
splitter = FileDataSplitter.CSV # FileDataSplitter enumeration
carry_other_attributes = True # Carry Column 2 (hotel, gas, bar...)
level = StorageLevel.MEMORY_ONLY # Storage level from pyspark
s_epsg = "epsg:4326" # Source EPSG code
t_epsg = "epsg:5070" # Target EPSG code

point_rdd = PointRDD(sc, input_location, offset, splitter, 
carry_other_attributes)
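
Since this call and the failing call read the same input_location, the
difference is likely environmental rather than a GeoSpark bug. One hedged
workaround sketch is to take HDFS out of the picture with an explicit file://
URI; the absolute path below is illustrative, and the file must be readable at
that path from the driver and every executor:

# Workaround sketch: bypass fs.defaultFS with an explicit local URI.
# The path is an assumption; adjust it to wherever checkin.csv lives.
input_location = "file:///home/user/checkin.csv"
point_rdd = PointRDD(sc, input_location, splitter, carry_other_attributes,
                     level, s_epsg, t_epsg)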


Re: GEOSPARK ERROR PYTHON

2021-02-12 Thread Jia Yu
Hello,

1. Which Sedona version are you using? We strongly recommend the latest
Sedona 1.0.0.
2. Are you using the correct Sedona jars for Scala 2.11 and Spark 2.4? (A
sketch of pinning matching jars follows this list.)
3. Your screenshot does not show the entire stack trace. I can only see the
connection-refused error, which is not the root cause. Please show us a
complete screenshot that contains the root exception.
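
A minimal sketch of point 2, pinning jars that match Spark 2.4 and Scala 2.11
when the session starts. The coordinate below is an assumed example based on
the versions named above; verify the exact artifact on Maven Central:

# Sketch: fetch a Sedona build matching Spark 2.4 / Scala 2.11 at startup.
# The coordinate is an assumption; check Maven Central for your cluster.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("sedona-jar-check")
         .config("spark.jars.packages",
                 "org.apache.sedona:sedona-python-adapter-2.4_2.11:1.0.0-incubating")
         .getOrCreate())
sc = spark.sparkContext  # reused as `sc` by the PointRDD snippets above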

Jia

On Fri, Feb 12, 2021 at 3:42 PM Ramon Barros  wrote:

> I'm using Python 3.6 and Spark 2.4, and I added the correct jars to Spark.
> Some functions execute correctly, while others raise the error shown in the
> image below. Can anyone help?
>


Re: [DISCUSS] Put all GeoTools jars into a package on Maven Central

2021-02-12 Thread Felix Cheung
LGPL is Category X; it wouldn't be something Apache Sedona should distribute
or depend on:
https://www.apache.org/legal/resolved.html#optional


On Thu, Feb 11, 2021 at 11:59 PM Jia Yu  wrote:

> OSGeo / LocationTech owns GeoTools. I am considering whether I should
> publish my own wrapper on Maven Central to bring the GeoTools jars that
> Sedona requires to Maven Central. Since GeoTools is LGPL, it might be OK to
> do so.
>
> On Thu, Feb 11, 2021 at 5:18 PM Felix Cheung 
> wrote:
>
> > Who owns or manages GeoTools if it is LGPL?
> >
> > On Thu, Feb 11, 2021 at 12:01 PM Jia Yu  wrote:
> >
> >> Pawel,
> >>
> >> The Python-adapter module is always in use by our users, but it does not
> >> come with GeoTools. To use it, users have to (1) compile the source code
> >> of the Python-adapter, or (2) add the GeoTools coordinates from the OSGEO
> >> repo via config(""), or (3) download the GeoTools jars and copy them to
> >> SPARK_HOME/jars/.
> >>
> >> The easiest is (2), but it looks like it may not work in all
> >> environments, since it needs to search the OSGEO repo.
> >>
> >> What I am saying is that if we "move" the GeoTools jars to Maven Central,
> >> Method 2 will always work: users just need to add the
> >> "sedona-python-adapter-1.0.0-incubating" and "geotools-24-wrapper-1.0.0"
> >> coordinates in code, as in the sketch below.
> >>
> >> Do you think this is necessary?
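> >>
> >> A minimal sketch of Method 2 once both artifacts are on Maven Central,
> >> using the coordinates named in this thread (the Sedona coordinate's
> >> Spark/Scala suffix is an assumption; verify both on Maven Central):
> >>
> >> # Sketch: resolve Sedona plus the proposed GeoTools wrapper from
> >> # Maven Central. Coordinates follow this thread; suffixes are assumed.
> >> from pyspark.sql import SparkSession
> >>
> >> spark = (SparkSession.builder
> >>          .config("spark.jars.packages",
> >>                  "org.apache.sedona:sedona-python-adapter-2.4_2.11:1.0.0-incubating,"
> >>                  "org.datasyslab:geotools-24-wrapper:1.0.0")
> >>          .getOrCreate())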
> >>
> >> On Thu, Feb 11, 2021 at 11:40 AM Paweł Kociński <
> >> pawel93kocin...@gmail.com> wrote:
> >>
> >>> Both options seem good to me, but we have to remember that not all
> >>> Sedona users use cloud solutions; some of them use Spark with Hadoop.
> >>> What about the python-adapter module within the Sedona project? Am I
> >>> missing something?
> >>> Regards,
> >>> Paweł
> >>>
> >>> On Thu, 11 Feb 2021 at 14:40, Netanel Malka 
> >>> wrote:
> >>>
>  I think that we can make it work on Databricks without any changes.
>  After creating a cluster on Databricks, the user can install the
>  geotools packages and provide the osgeo (or any other) repo explicitly.
> 
>  As you can see in the picture:
> 
>  [image: image.png]
>  I can provide the details on how to install it.
> 
>  I think it will solve the problem.
>  What do you think?
> 
> 
>  On Thu, 11 Feb 2021 at 12:24, Jia Yu  wrote:
> 
> > Hi folks,
> >
> > As you can see from the recent discussion in the mailing list
> > <[Bug][Python] Missing Java class>, in Sedona 1.0.0, because those LGPL
> > GeoTools jars are not on Maven Central (only in OSGEO repo), Databricks
> > cannot get GeoTools jars.
> >
> > I believe this will cause a lot of trouble for our future Python users.
> > Reading Shapefiles and doing CRS transformations are big selling points
> > for Sedona.
> >
> > The easiest way to fix this, without violating ASF policy, is that I
> > will publish a GeoTools wrapper on Maven Central using the old GeoSpark
> > group ID: https://mvnrepository.com/artifact/org.datasyslab
> >
> > For example, org.datasyslab:geotools-24-wrapper:1.0.0
> >
> > 1. This GeoTools wrapper does nothing but bring the GeoTools jars needed
> > by Sedona to Maven Central.
> > 2. When the Python user calls Sedona, they can add one more package:
> > org.datasyslab:geotools-24-wrapper:1.0.0
> >
> > Another good thing is that this does not require a new source code
> > release from Sedona. We only need to update the website and let users
> > know how to call it.
> >
> > Any better ideas?
> >
> > Thanks,
> > Jia
> >
> >
> >
> 
>  --
>  Best regards,
>  Netanel Malka.
> 
> >>>
>