Hello all,
I need to process, in Scala, data generated from molecular dynamics
simulations and stored in NetCDF format. I am trying to use Sedona to
read the data, which is spread across multiple files. Of the posts
available on this mailing list, the closest to my requirement was
https://www.mail-archive.com/[email protected]/msg00684.html. However, I
have not been able to adapt it to my case. Some other resources I tried
suggest org.apache.sedona.core.format.WKBFileReader, which I could not
find in the current version; the closest class available there is
org.apache.sedona.core.formatMapper.WkbReader, which does not seem to be
useful for NetCDF data.

I would be grateful for a pointer on how to create a single RDD from
multiple NetCDF files (say f1.nc, f2.nc, ...). This was previously done
with SciSpark, which is not compatible with Spark 3. What I am trying to
accomplish is shown below; the resulting RDD is then used for further
processing:

// Read NetCDF files from the file lists in ncDirectoryPath
val ncFilesRDD: org.apache.spark.rdd.RDD[org.dia.core.SciTensor] =
  sc.netcdfFileList(
    "file://" + ncDirectoryPath,
    List("time", "cell_lengths", "coordinates")
  )

// Extract the variables of interest from each file as arrays
val ncFileCRDArrayRDD = ncFilesRDD.map(x =>
  (x.variables.get("time").get.data.toArray,
   x.variables.get("cell_lengths").get.data.toArray,
   x.variables.get("coordinates").get.data.toArray)
)

Thanks in advance,

With regards,
-Sanjeev
