Re: [gdal-dev] Does /vsis3 work for HDF4?

2019-05-20 Thread James McClain
Hello,

I don't believe that it does currently.

Originally, support for HDF5 and NetCDF both used the cpl_userfaultfd
mechanism
(https://github.com/OSGeo/gdal/blob/master/gdal/port/cpl_userfaultfd.cpp).
The thing that prevented HDF4 from using that mechanism was that its
underlying library did not have the ability to operate on a block of memory
(the cpl_userfaultfd provides a user-mode page fault handler that fills
pages on demand using the VSI mechanism).
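
To illustrate the idea only: the sketch below is a toy Python model of
demand paging, not the cpl_userfaultfd code.  The class `LazyBuffer`, its
`fetch` callback, and the 4096-byte page size are assumptions made for the
example; in GDAL the role of `fetch` is played by reads through the VSI
layer, and the faulting is done by the kernel rather than by a Python method.

```python
class LazyBuffer:
    """Toy model of a memory region whose pages are filled on first access."""

    def __init__(self, size, fetch, page_size=4096):
        self.size = size
        self.fetch = fetch          # hypothetical callback: fetch(offset, length) -> bytes
        self.page_size = page_size
        self.pages = {}             # page index -> bytes already loaded

    def _page(self, index):
        # "Page fault": the page is missing, so fill it from the backing store.
        if index not in self.pages:
            offset = index * self.page_size
            length = min(self.page_size, self.size - offset)
            self.pages[index] = self.fetch(offset, length)
        return self.pages[index]

    def read(self, offset, length):
        if offset < 0 or length < 0 or offset + length > self.size:
            raise ValueError("read outside of buffer")
        out = bytearray()
        end = offset + length
        while offset < end:
            index, start = divmod(offset, self.page_size)
            chunk = self._page(index)[start:start + (end - offset)]
            out += chunk
            offset += len(chunk)
        return bytes(out)
```

A library that can only read through its own file I/O calls has no place
to plug in such a buffer, which is the obstacle described above for HDF4.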

It is my recollection that a more direct method was found for HDF5 so that
it no longer uses cpl_userfaultfd; perhaps a similarly more direct
mechanism exists in HDF4?  I do not know whether it does or does not.

Sincerely,
James McClain


On Mon, May 20, 2019 at 2:45 PM Joe Lee  wrote:

> Hi,
>
>   I have a quick question.
>   I know that /vsis3 works for HDF5.
>
>   How about HDF4?
>   Does /vsis3 also work for HDF4?
>
>   Regards,
>
> ___
> gdal-dev mailing list
> gdal-dev@lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev



-- 
James McClain
Software Developer
Azavea  |  990 Spring Garden Street, 5th Floor, Philadelphia, PA 19123
___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev

Re: [gdal-dev] GDALWarp Datasets

2019-02-15 Thread James McClain
Hello,

Okay, all of that is well-taken.  Thank you for your response.

> If you're not interested in the pixel values, using
> GDALSuggestedWarpOutput2() is a better fit.

Yes, my question was worded a bit unclearly: I am essentially interested in
retrieving warped tiles so the pixels are of primary interest.
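
A minimal sketch of the "one dataset object per thread" advice quoted
below, assuming nothing about GDAL itself: the class name `PerThreadDataset`
and the `opener` callable are hypothetical, and the opener can be anything
that returns a fresh handle.

```python
import threading


class PerThreadDataset:
    """Hands each thread its own dataset handle, opened lazily on first use."""

    def __init__(self, opener):
        # opener: zero-argument callable returning a new dataset object,
        # e.g. lambda: gdal.Open("warped.vrt") (hypothetical file name).
        self._opener = opener
        self._local = threading.local()

    def get(self):
        ds = getattr(self._local, "ds", None)
        if ds is None:
            ds = self._opener()
            self._local.ds = ds
        return ds
```

With the Python bindings, the opener could presumably be
`lambda: gdal.Open(path)` on a warped VRT that was written out once, so
that every thread reads through its own handle.  I have not measured how
that behaves under load.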

Thank you,
James McClain



On Fri, Feb 15, 2019 at 4:58 AM Even Rouault 
wrote:

> On vendredi 15 février 2019 00:06:28 CET James McClain wrote:
> > Hello,
> >
> > If I may, I would like to ask a few questions regarding how best to use the
> > GDALWarp function (
> > https://github.com/OSGeo/gdal/blob/master/gdal/apps/gdalwarp_lib.cpp#L748-L770)
> > and the datasets that it creates.
> >
> > In particular, I am experimenting with using such datasets to read warped
> > extents on an on-demand basis and in a thread-safe way.
> >
> > I have found that the VRT output format is lazy with respect to computation
> > and allocation which is great for my use case, but it appears not to be
> > thread safe with respect to reading from a single warped dataset from
> > multiple threads.
>
> No single GDALDataset object of any driver should be used simultaneously by
> multiple threads (this extends to any GDAL/OGR class instance). If this seems
> to work for GTiff, you're just lucky, but that might just fail as well.
> If you need to access a dataset from several threads, each one should have its
> own GDALDataset object (so this won't work for the MEM driver)
>
> >
> > On a more meta level, are datasets created by GDALWarp the right approach
> > (if I want to lazily get warped extents in a thread-safe manner) or should I
> > look elsewhere?
>
> If you're not interested in the pixel values, using
> GDALSuggestedWarpOutput2() is a better fit.
>
> Even
>
> --
> Spatialys - Geospatial professional services
> http://www.spatialys.com



-- 
"I prayed for freedom for twenty years, but received no answer until I
prayed with my legs."
 -- Frederick Douglass

[gdal-dev] GDALWarp Datasets

2019-02-14 Thread James McClain
Hello,

If I may, I would like to ask a few questions regarding how best to use the
GDALWarp function (
https://github.com/OSGeo/gdal/blob/master/gdal/apps/gdalwarp_lib.cpp#L748-L770)
and the datasets that it creates.

In particular, I am experimenting with using such datasets to read warped
extents on an on-demand basis and in a thread-safe way.
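
To make "on-demand" concrete, here is a small sketch of how a raster can
be cut into read windows so that only the requested tiles are ever
touched.  The function name `tile_windows` and the default tile size of
256 are my own choices for illustration.

```python
def tile_windows(width, height, tile_size=256):
    """Yield (xoff, yoff, xsize, ysize) windows covering a width x height raster."""
    if width <= 0 or height <= 0 or tile_size <= 0:
        raise ValueError("width, height and tile_size must be positive")
    for yoff in range(0, height, tile_size):
        ysize = min(tile_size, height - yoff)      # last row may be shorter
        for xoff in range(0, width, tile_size):
            xsize = min(tile_size, width - xoff)   # last column may be narrower
            yield xoff, yoff, xsize, ysize
```

Each tuple has the shape expected by the window arguments of
`ReadAsArray(xoff, yoff, xsize, ysize)` in the Python bindings.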

I have found that the VRT output format is lazy with respect to computation
and allocation which is great for my use case, but it appears not to be
thread safe with respect to reading from a single warped dataset from
multiple threads.

I have found that the MEM output format is safe with respect to concurrent
reads, but is eager in its allocation and computation.

I have found that the GTIFF output format is also safe with respect to
concurrent reads (but CPU utilization is low so it looks like there is
internal locking), and it is once again eager with respect to allocation
and computation.

What I would like to find is a way to create a dataset that is both lazy and
safe with respect to concurrent reads.  Can anyone share any suggestions
with me?

(Note: the foregoing statements are based on experiments that I have done,
not based on looking at the code; if any of the above is incorrect I am
very happy to be corrected.)

On a more meta level, are datasets created by GDALWarp the right approach
(if I want to lazily get warped extents in a thread-safe manner) or should I
look elsewhere?

Thank you,
James McClain


Re: [gdal-dev] Question on how to open a raster in HDFS using GDAL

2018-11-27 Thread James McClain
You are very welcome, thank you for the report!
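
For anyone else building against Cloudera, the file staging described in
the report quoted below could be scripted roughly as follows.  This is
only a sketch of the steps as reported: the function name
`stage_hdfs_layout` is hypothetical, and the directory names are taken
from that report rather than verified on a Cloudera installation.

```python
import shutil
from pathlib import Path


def stage_hdfs_layout(cloudera_root):
    """Copy hdfs.h and libhdfs.so to where the report says the build looks."""
    root = Path(cloudera_root)
    hadoop = root / "lib" / "hadoop"
    copies = [
        # ./configure looks for include/hdfs.h under the Hadoop path
        (root / "include" / "hdfs.h", hadoop / "include" / "hdfs.h"),
        # make looks for lib/native/libhdfs.so under the Hadoop path
        (root / "lib64" / "libhdfs.so", hadoop / "lib" / "native" / "libhdfs.so"),
    ]
    for src, dst in copies:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
    return hadoop  # the directory to pass to --with-hdfs
```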

On Tue, Nov 27, 2018 at 5:44 PM ZAZHIL-HA HERENA 
wrote:

> Hi there,
>
> Thanks everybody for your valuable help, with your suggestions I finally
> got it working in my cluster with Cloudera 6 distribution(Hadoop 3) with
> Java, I would like to list here a few things I had to do to make it work
> since I used Cloudera to compile:
>
>
> - In Cloudera, Hadoop libraries are under $CLOUDERA_PATH/lib/hadoop so
>   this is the path I use for ./configure in --with-hdfs
> - Then ./configure is looking for a file include/hdfs.h; this
>   directory in Cloudera exists in a different path (outside Hadoop),
>   $CLOUDERA_PATH/include, so I had to copy it under the Hadoop path.
> - Same for libhdfs.so: make expects to find it in
>   lib/native/libhdfs.so; for Cloudera it is in $CLOUDERA_PATH/lib64,
>   so I just copied it to the expected location.
>
>
> I tested using command line and also my Java application and /vsihdfs/
> works as expected.
>
> Thanks!!!
> Zazhil-ha
> ------
> *From:* gdal-dev  on behalf of James
> McClain 
> *Sent:* Friday, November 23, 2018 3:18 PM
> *To:* gdal-dev@lists.osgeo.org
> *Subject:* Re: [gdal-dev] Question on how to open a raster in HDFS using
> GDAL
>
> Hello,
>
> It may not be finding the native HDFS libraries.  Please see the pull
> request https://github.com/OSGeo/gdal/pull/714 for build instructions (in
> particular, you may need to augment the LD_LIBRARY_PATH environment
> variable).
>
> If trouble persists, I would suggest building against Apache Hadoop 2.7.6
> or 2.7.7 (both of those are known to work) as an experiment.
>
> Sincerely,
> James McClain
>
> On Fri, Nov 23, 2018 at 2:15 PM ZAZHIL-HA HERENA 
> wrote:
>
> Thank you so much!, now I am working on 2.4 source code but I am getting
> an error when trying to configure using:
>
> ./configure \
>   --prefix=/scratch/zherena/gdal/build/gdal-master/gdal/outputb/ \
>   --with-complete=yes \
>   --with-java=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-11.b12.el7.x86_64/ \
>   --with-swig-java=yes \
>   --with-hdfs=/scratch/zherena/gdal/CDH-6.0.1-1.cdh6.0.1.p0.590678/ \
>   --with-curl=/usr/bin/curl-config
>
> The error I get is:
>
>   checking for HDFS in /scratch/zherena/gdal/CDH-6.0.1-1.cdh6.0.1.p0.590678/... checking for hdfsConnect in -lhdfs... no
>   checking for /scratch/zherena/gdal/CDH-6.0.1-1.cdh6.0.1.p0.590678//include/hdfs.h... yes
>   configure: error: HDFS support not enabled.
>
>
> Is there any configuration in my environment that I should consider? or
> maybe another distribution of Hadoop?
>
>
> --
> *From:* Even Rouault 
> *Sent:* Friday, November 23, 2018 11:52 AM
> *To:* gdal-dev@lists.osgeo.org
> *Cc:* ZAZHIL-HA HERENA; James McClain; n...@nikosalexandris.net
> *Subject:* Re: [gdal-dev] Question on how to open a raster in HDFS using
> GDAL
>
> > Version says 2.3.2 but libraries say: libgdal.so.20.4.2 .
>
> Libtool number (.so.20.4.2) has nothing to do with user-friendly version
> number (2.3.2)
>
> > I am not sure if I
> > got the latest code, this is the first time I compile it myself, I used
> > this link to download source code:
> > http://download.osgeo.org/gdal/CURRENT/gdal-2.3.2.tar.gz
>
> This is the latest release, but /vsihdfs/ is in the development version, not
> yet released, so download
>
> https://github.com/OSGeo/gdal/archive/master.zip
>
> --
> Spatialys - Geospatial professional services
> http://www.spatialys.com
>

Re: [gdal-dev] Question on how to open a raster in HDFS using GDAL

2018-11-23 Thread James McClain
Hello,

It may not be finding the native HDFS libraries.  Please see the pull
request https://github.com/OSGeo/gdal/pull/714 for build instructions (in
particular, you may need to augment the LD_LIBRARY_PATH environment
variable).
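
As a quick check before rebuilding, something like the following sketch
can tell you whether libhdfs.so is in any directory listed in
LD_LIBRARY_PATH.  The helper name `find_on_library_path` is made up for
this example, and a result of `None` is not conclusive because the dynamic
loader also searches other locations.

```python
import os


def find_on_library_path(filename, search_path):
    """Return the first directory in an os.pathsep-separated path holding filename."""
    for directory in search_path.split(os.pathsep):
        # Skip empty entries such as those produced by a trailing separator.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            return directory
    return None


print(find_on_library_path("libhdfs.so", os.environ.get("LD_LIBRARY_PATH", "")))
```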

If trouble persists, I would suggest building against Apache Hadoop 2.7.6
or 2.7.7 (both of those are known to work) as an experiment.

Sincerely,
James McClain

On Fri, Nov 23, 2018 at 2:15 PM ZAZHIL-HA HERENA 
wrote:

> Thank you so much!, now I am working on 2.4 source code but I am getting
> an error when trying to configure using:
>
> ./configure \
>   --prefix=/scratch/zherena/gdal/build/gdal-master/gdal/outputb/ \
>   --with-complete=yes \
>   --with-java=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-11.b12.el7.x86_64/ \
>   --with-swig-java=yes \
>   --with-hdfs=/scratch/zherena/gdal/CDH-6.0.1-1.cdh6.0.1.p0.590678/ \
>   --with-curl=/usr/bin/curl-config
>
> The error I get is:
>
>   checking for HDFS in /scratch/zherena/gdal/CDH-6.0.1-1.cdh6.0.1.p0.590678/... checking for hdfsConnect in -lhdfs... no
>   checking for /scratch/zherena/gdal/CDH-6.0.1-1.cdh6.0.1.p0.590678//include/hdfs.h... yes
>   configure: error: HDFS support not enabled.
>
>
> Is there any configuration in my environment that I should consider? or
> maybe another distribution of Hadoop?
>
>
> --
> *From:* Even Rouault 
> *Sent:* Friday, November 23, 2018 11:52 AM
> *To:* gdal-dev@lists.osgeo.org
> *Cc:* ZAZHIL-HA HERENA; James McClain; n...@nikosalexandris.net
> *Subject:* Re: [gdal-dev] Question on how to open a raster in HDFS using
> GDAL
>
> > Version says 2.3.2 but libraries say: libgdal.so.20.4.2 .
>
> Libtool number (.so.20.4.2) has nothing to do with user-friendly version
> number (2.3.2)
>
> > I am not sure if I
> > got the latest code, this is the first time I compile it myself, I used
> > this link to download source code:
> > http://download.osgeo.org/gdal/CURRENT/gdal-2.3.2.tar.gz
>
> This is the latest release, but /vsihdfs/ is in the development version, not
> yet released, so download
>
> https://github.com/OSGeo/gdal/archive/master.zip
>
> --
> Spatialys - Geospatial professional services
> http://www.spatialys.com
>



Re: [gdal-dev] Question on how to open a raster in HDFS using GDAL

2018-11-22 Thread James McClain
Hello,

I am the author of the vsihdfs code, and I am ready and willing to help.

I just rebuilt it from current master and was able to successfully open a
dataset via an HDFS URI with the GDAL Python bindings.  I have a few
suggestions.

First, please try putting the file into a local directory and try something
like `gdalinfo /vsihdfs/file:/tmp/kahoolawe.tif` to establish a baseline.
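
The path is simply the /vsihdfs/ prefix followed by the URI.  A tiny
helper along these lines (the name `vsihdfs_path` is hypothetical) builds
it and also rejects embedded whitespace, since a stray space in the URI is
an easy way to end up with a "No such file or directory" error.

```python
def vsihdfs_path(uri):
    """Prefix an HDFS-style URI with /vsihdfs/, rejecting embedded whitespace."""
    if any(ch.isspace() for ch in uri):
        raise ValueError("URI contains whitespace: %r" % uri)
    return "/vsihdfs/" + uri


print(vsihdfs_path("file:/tmp/kahoolawe.tif"))
```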

Second, if you are using the Python bindings, please make sure that they
have been built and installed (and that you are using the ones that you
built rather than other ones that exist on your system).  Instructions for
building the Python bindings can be found here:
https://trac.osgeo.org/gdal/wiki/BuildingOnUnix .

In my case, after building and installing the library and bindings, I was
able to successfully open a dataset by starting a python REPL like this:

```bash
# Native Hadoop, JVM, and freshly built GDAL libraries must all be findable
export LD_LIBRARY_PATH=$HOME/local/hadoop-2.7.7/lib/native:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server:$HOME/local/gdal-master-vsihdfs/lib
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
# libhdfs needs the Hadoop jars on the classpath
export CLASSPATH=$($HOME/local/hadoop-2.7.7/bin/hadoop classpath --glob)
# Start a REPL that uses the bindings built above
PYTHONPATH=$HOME/local/gdal-master-vsihdfs/lib/python2.7/site-packages python
```

then typing this into it:

```python
from osgeo import gdal, gdalconst
ds = gdal.Open('/vsihdfs/file:/tmp/testfile.tif', gdalconst.GA_ReadOnly)
```

(I do not have easy access to an HDFS cluster right at the moment, so I
only tested a local HDFS URI.)

A note: After having done a build without HDFS support in the tree, I had
to do a `make clean distclean` before I was able to get a build with working
HDFS support.

Sincerely,
James McClain

On Thu, Nov 22, 2018 at 8:13 PM Nikos Alexandris 
wrote:

> * ZAZHIL-HA HERENA  [2018-11-22 22:35:32 +]:
>
> >Hello, I am not sure if I should use this mailing list to ask questions
> but I wanted to try, I am a developer trying to use GDAL to open rasters in
> HDFS.
> >
> >
> >I read in GDAL documentation that starting 2.4 it is possible to open a
> raster in HDFS. I downloaded and compiled the latest source code available
> version and the generated libraries show it is 2.4 (libgdal.so.20.4.2). I
> compiled with option "-with-hdfs=yes" and "--with-java=yes".
> >
> >I am trying to open a raster using:
> >
> >
> >
> >Dataset raster = gdal.Open("/vsihdfs/hdfs://node:8020/user/hdfs
> /spatial_raster/input_raster/kahoolawe.tif", gdalconst.GA_ReadOnly);
>
> Is your path correct? There is a space here (in "/hdfs /").
>
> Nikos
>
> >
> >
> >but I am getting the following error: "ERROR 4: No such file or directory"
>
> [rest deleted]

Re: [gdal-dev] HDF5 and NetCDF now have VSI support

2018-08-27 Thread James McClain
That is good news, I am glad to see that that capability is now available
to more people.

On Sun, Aug 26, 2018 at 3:07 PM Even Rouault 
wrote:

> Hi,
>
> I've pushed a change in master that now enables /vsi support for HDF5 for all
> operating systems, and not just Linux >= 4.3. This uses the HDF5 Virtual File
> Driver mechanism:
> https://support.hdfgroup.org/HDF5/doc/TechNotes/VFL.html
> which allows plugging in a custom low-level I/O layer.
>
> (In theory, as netCDF V4 uses HDF5 underneath, this could be available for it
> too, but the use of HDF5 is hidden by the netCDF API, so this would require
> patching libnetcdf)
>
> Even
>
> --
> Spatialys - Geospatial professional services
> http://www.spatialys.com
>



Re: [gdal-dev] HDF5 and NetCDF now have VSI support

2018-08-08 Thread James McClain
Ah yes, my apologies.


On Wed, Aug 8, 2018 at 8:30 AM Even Rouault 
wrote:

> > A concrete example: you should probably not set up a
> > /vsimem asset from one thread while another thread is reading a uffd backed
> > asset
>
> Just a small correction: James meant here not to mix uffd with the use of the
> API in cpl_virtualmem.h ( CPLVirtualMemNew() and the like ).
>
> /vsimem/ itself has nothing to do with cpl_virtualmem.h and so can be used
> with uffd without problem.
>
> Even
>
> --
> Spatialys - Geospatial professional services
> http://www.spatialys.com
>



[gdal-dev] HDF5 and NetCDF now have VSI support

2018-08-08 Thread James McClain
Hello,

Support for use of /vsi paths to HDF5 and NetCDF files has just been merged
into master (please see here https://github.com/OSGeo/gdal/pull/786 ).

We believe that this will be helpful for many people because it is now
possible to work with remote files of those formats on an on-demand basis
rather than being required to download them.  For example, with an
installation built from current master, it is now possible to type

gdalinfo
'/vsis3/nasanex/NEX-GDDP/BCSD/rcp85/day/atmos/tasmax/r1i1p1/v1.0/tasmax_day_BCSD_rcp85_r1i1p1_inmcm4_2100.nc'

and receive the appropriate output, or such a path can be used from within
a program.  All of the various VSI drivers should work (e.g. /vsicurl,
/vsitar, et cetera).
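
For use from within a program, the paths are plain strings, so they can be
assembled with helpers like the ones sketched here.  The function names
are hypothetical and example.com is a placeholder; only the /vsis3/,
/vsicurl/ and /vsitar/ prefixes themselves come from GDAL.

```python
def vsis3(bucket, key):
    """Path to an object in an S3 bucket."""
    return "/vsis3/%s/%s" % (bucket, key.lstrip("/"))


def vsicurl(url):
    """Path to a file served over HTTP(S) or FTP."""
    return "/vsicurl/" + url


def vsitar(archive, member):
    """Path to a member of a tar archive; archive may itself be a /vsi path."""
    return "/vsitar/%s/%s" % (archive, member.lstrip("/"))


# Chaining: a NetCDF file inside a remote tar archive (placeholder URL).
print(vsitar(vsicurl("https://example.com/data.tar"), "tasmax.nc"))
```

The resulting strings can be handed to gdalinfo or to `gdal.Open` in the
same way as the /vsis3 path above.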

There are a few caveats to mention.  First, this capability makes use of
the userfaultfd (user mode page fault handling) capability in recent Linux
kernels so you will need Linux 4.3 or later.

Also, this capability is read-only at this time.

Also, for NetCDF support, you will need libnetcdf 4.5 or later.

Also, because files are mapped into virtual memory, users might want to
impose a (soft) limit on the number of virtual memory pages that can be
consumed by any given file.  That is done by setting the GDAL_UFFD_LIMIT
configuration option to some integer value indicating the number of pages.
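
Since the limit is expressed in pages rather than bytes, a small
conversion may be handy.  In this sketch the helper name
`uffd_page_limit` and the 64 MiB budget are arbitrary assumptions for
illustration; only the GDAL_UFFD_LIMIT option name comes from the work
described here.

```python
import mmap
import os


def uffd_page_limit(budget_bytes, page_size=mmap.PAGESIZE):
    """Number of whole pages that fit in budget_bytes (at least one)."""
    if budget_bytes <= 0 or page_size <= 0:
        raise ValueError("budget_bytes and page_size must be positive")
    return max(1, budget_bytes // page_size)


# Configuration options can also be supplied through the environment,
# provided this happens before GDAL reads the option.
os.environ["GDAL_UFFD_LIMIT"] = str(uffd_page_limit(64 * 1024 * 1024))
```

From the Python bindings, `gdal.SetConfigOption("GDAL_UFFD_LIMIT", ...)`
with the same string value should be an equivalent way to set it.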

One should be careful when using GDAL_UFFD_LIMIT: because it makes use of
custom signal handlers for SIGSEGV and SIGBUS, it is important not to do
anything in another thread in the same process that also changes the
behaviour of those signals while actively using a dataset backed by the
uffd machinery.  A concrete example: you should probably not set up a
/vsimem asset from one thread while another thread is reading a uffd-backed
asset (it is, however, perfectly safe to use /vsimem from one thread and
uffd-backed assets from another as long as the /vsimem setup [when the
changes to signal handling happen] does not overlap in time with the
uffd-backed reads).

I would like to personally thank Even for his patience and accommodation
during this work and for his contributions to it.

Thanks,
James McClain
