Hi Chris,

I will check that out.

Thanks!
Travis

On Sun, Aug 25, 2013 at 11:13 PM, Mattmann, Chris A (398J) <
[email protected]> wrote:

> Guys, I did GDAL bindings for Tika in TIKA-605 by building the Java
> JAR bindings -- I think it's a good route (but the problem is that the
> Jar isn't in Maven Central).
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: [email protected]
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: Travis L Pinney <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Sunday, August 25, 2013 6:47 PM
> To: dev <[email protected]>
> Subject: Re: Proposal for merging Shapefile branch to trunk
>
> >Hi Adam and Martin,
> >
> >Would it be ok to leave it as is, since there is currently only a small
> >number of data storage modules? I think of storage as something that holds
> >what is common across all the different storage formats, like a Feature.
> >Eventually it will get to the point where you will not want to have a
> >multitude of jar files. I see sis-shapefile as a fairly distinct file
> >driver because of the complex format of a shapefile (not necessarily good
> >complexity).
> >
> >Adding GDAL bindings for common formats would be very useful. This would
> >make it easier to do large bulk processing of geospatial data with Hadoop,
> >as presented in the following video:
> >
> >https://www.youtube.com/watch?v=_JCPf89s-NI
> >
> >
> >Thanks,
> >Travis
> >
> >On Sun, Aug 25, 2013 at 8:06 PM, Adam Estrada
> ><[email protected]> wrote:
> >
> >> Hey Martin,
> >>
> >> Regarding where to put all the file format modules, I am just concerned
> >> that it might be difficult to keep things straight if there is a mixture
> >> of "complex" formats and everything else. I think we all trust your
> >> opinion on where to put things, but we really need to keep the end user
> >> and other potential committers in mind as development moves forward. For
> >> example, when I look at the directory structure in SVN [1], I
> >> automatically think that each format should be in its own module, like
> >> sis-netcdf, because of the way it's organized.
> >>
> >> Just my 2 cents at this point and feedback from other folks is certainly
> >> welcome :)
> >>
> >> Adam
> >>
> >> [1] https://svn.apache.org/repos/asf/sis/trunk/storage/
> >>
> >>
> >> On Sun, Aug 25, 2013 at 4:39 PM, Martin Desruisseaux <
> >> [email protected]> wrote:
> >>
> >> > Hello Adam
> >> >
> >> > On 25/08/13 21:34, Adam Estrada wrote:
> >> >
> >> >> It is true that the Shapefile is very widely used but it has lots and
> >> >> lots of limitations. The main one that I can think of is that it can't
> >> >> handle UTF-encoded characters in the attribute table. Can I suggest
> >> >> maybe working towards something like an "interchange" module where all
> >> >> the file formats live?
> >> >>
> >> >
> >> > I agree with all the above, and in the current SIS state the
> >> > "interchange" module is actually the "storage" group of modules. This
> >> > group of modules currently contains:
> >> >
> >> >  * sis-storage: provides the basis common to all formats.
> >> >  * sis-netcdf: for the NetCDF format.
> >> >
> >> >
> >> > My concern is about whether we should put the Shapefile code in its
> >> > own "sis-shapefile" module (which would depend on "sis-storage"), or
> >> > put it straight in "sis-storage".
> >> >
> >> > One extreme view is to adopt a "one format == one module" policy. But
> >> > in Geotoolkit.org, this policy resulted in more than 120 modules, some
> >> > of them with very few classes. In security-constrained environments,
> >> > where every JAR file requires its own SecurityManager policies, this is
> >> > very tedious.
> >> >
> >> > Consequently, I would like to group some formats in the same JAR files
> >> > in order to keep the number of modules reasonable. Then, the question
> >> > would be which granularity to choose. My proposal is to not put every
> >> > format in its own module, but to put a format in its own module if it
> >> > meets some of the following conditions:
> >> >
> >> >  * The format is not widely used.
> >> >  * The format is complex, so it requires a large number of classes or
> >> >    resources.
> >> >  * The format depends on an external library or on native code.
> >> >
> >> >
> >> > The NetCDF format is proposed in its own module because it is complex
> >> > (the classes currently in "sis-netcdf" are just scratching the surface)
> >> > and may have a dependency on a large library (though I would like to
> >> > keep that dependency optional). Shapefile, on the contrary, is
> >> > relatively simple and needs no external dependency.
> >> >
> >> > Given that "sis-storage" would be the basis of all formats in SIS, my
> >> > proposal is to also put in "sis-storage" some formats considered
> >> > "fundamental", meaning formats so widespread that any user is very
> >> > likely to encounter them. They would not be the only or "main" SIS
> >> > formats; they would rather be the "minimal requirements". Other modules
> >> > like "sis-netcdf" would provide more elaborate formats.
> >> >
> >> >
> >> >
> >> >> For vector data, there are quite a few of them out there. OGR
> >> >> references many of them [1] but that opens the debate on whether or
> >> >> not to just use GDAL. I suppose we could just have GDAL support as a
> >> >> module which would require some sort of JNI bindings to work in a pure
> >> >> Java library like SIS. What are your thoughts on this?
> >> >>
> >> >
> >> > Yes, this is also the plan :-). We already used GDAL through JNI on our
> >> > side, and that code is also part of the proposed migration to SIS. The
> >> > approach that I would recommend is to use pure Java code for many
> >> > formats (Shapefile, ASCII grid, GeoTIFF, NetCDF, PNG), and fall back on
> >> > GDAL as a complement for other formats.
> >> >
> >> > A similar argument applies to Coordinate Transformation Services. We
> >> > have pure Java code (its port to SIS started last week, beginning with
> >> > WKT), but we plan to support Proj.4 through JNI even for map
> >> > projections available in pure Java, because in some situations a user
> >> > may need a guarantee of getting exactly the same results as PostGIS or
> >> > MapServer, for instance (those products are built on top of Proj.4).
> >> >
> >> >     Martin
> >> >
> >> >
> >>
>
>
