Hi, it looks like the error is that your input format is not correct.
Are you sure that, in your input file, the geometry starts at offset 0?

Jia

On Fri, Feb 12, 2021 at 4:10 PM Ramon Barros <[email protected]> wrote:

> I'm using Python 3.6 and Spark 2.4, and I added the correct jars to
> Spark. Some functions execute correctly, while others raise the error
> below. Can anyone help?
>
> ERROR EXECUTING POINTRDD:
>
> It happens when I try to construct a PointRDD passing these two extra
> parameters (s_epsg, t_epsg):
>
> point_rdd = PointRDD(sc, input_location, splitter, carry_other_attributes,
>                      level, s_epsg, t_epsg)
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/opt/anaconda3/lib/python3.6/site-packages/geospark/utils/meta.py", line 122, in __call__
>     return method(*args, **kwargs)
>   File "/opt/anaconda3/lib/python3.6/site-packages/geospark/core/SpatialRDD/point_rdd.py", line 381, in __init__
>     targetEpsgCode
>   File "/opt/anaconda3/lib/python3.6/site-packages/py4j/java_gateway.py", line 1569, in __call__
>     answer, self._gateway_client, None, self._fqn)
>   File "/usr/local/spark/python/pyspark/sql/utils.py", line 63, in deco
>     return f(*a, **kw)
>   File "/opt/anaconda3/lib/python3.6/site-packages/py4j/protocol.py", line 328, in get_return_value
>     format(target_id, ".", name), value)
> py4j.protocol.Py4JJavaError: An error occurred while calling None.org.datasyslab.geospark.spatialRDD.PointRDD.
> : java.net.ConnectException: Call From hadoopMaster/192.168.0.103 to hadoopMaster:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
>   at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1479)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>   at com.sun.proxy.$Proxy26.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy27.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
>   at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
>   at org.apache.hadoop.fs.Globber.glob(Globber.java:252)
>   at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1676)
>   at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:259)
>   at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
>   at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
>   at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:204)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
>   at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.s
>
> ramonbarrosk @ramonbarrosk Feb 11 11:15
> These instructions run normally, but this PointRDD fails with the error
> above:
>
> from pyspark import StorageLevel
> from geospark.core.SpatialRDD import PointRDD
> from geospark.core.enums import FileDataSplitter
>
> input_location = "checkin.csv"
> offset = 0  # The point long/lat starts from Column 0
> splitter = FileDataSplitter.CSV  # FileDataSplitter enumeration
> carry_other_attributes = True  # Carry Column 2 (hotel, gas, bar...)
> level = StorageLevel.MEMORY_ONLY  # Storage level from pyspark
> s_epsg = "epsg:4326"  # Source epsg code
> t_epsg = "epsg:5070"  # Target epsg code
>
> point_rdd = PointRDD(sc, input_location, offset, splitter, carry_other_attributes)
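For anyone landing on this thread: offset tells the CSV splitter in which
column the point geometry begins, so offset = 0 means columns 0 and 1 hold
longitude and latitude, and carry_other_attributes picks up the remaining
column(s). A minimal sketch of a checkin.csv consistent with those settings
(the rows below are illustrative, not Ramon's actual data):

    -88.331492,32.324142,hotel
    -88.175933,32.360763,gas
    -88.388954,32.357073,bar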
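Two further things worth checking, offered as hedged sketches rather than
confirmed fixes. First, compared with the working five-argument constructor,
the failing call appears to drop the offset argument, so splitter lands where
Offset is expected. Assuming your geospark version exposes an overload that
appends the storage level and the source and target EPSG codes (check the
__init__ at point_rdd.py line 381 in the traceback to confirm the exact
signature), the call would look like:

    point_rdd = PointRDD(sc, input_location, offset, splitter,
                         carry_other_attributes, level, s_epsg, t_epsg)

Second, the java.net.ConnectException shows that the bare path "checkin.csv"
is being resolved against the default filesystem, HDFS at hadoopMaster:9000,
which is refusing connections. Either start the namenode or give the file an
explicit local URI (the path below is hypothetical):

    input_location = "file:///home/user/checkin.csv"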
