I have created a (currently empty) page on the OSGeo wiki, at http://wiki.osgeo.org/index.php/Geodata_formats, to start filling in the concrete details that come out of this discussion. Please use it for your wishlists for a new open data format, lists of existing data formats with links to their specifications, etc. Please also join the Geodata mailing list (http://www.osgeo.org/geodata) and continue this thread there, with debate and discussion relating to a new format, as I believe it is a more appropriate venue.
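To seed the wiki page, here is a minimal sketch of what the SQLite-based, single-file layout proposed further down this thread might look like. Everything in it is an assumption for illustration only: the `features` table, its column names, the choice of storing point geometry as a Well-Known Binary blob, and the helper function names are all hypothetical and not anything that has been specified in this discussion. It only uses Python's standard `sqlite3` and `struct` modules.

```python
import sqlite3
import struct


def point_wkb(x, y):
    """Encode a 2D point as little-endian Well-Known Binary (geometry type 1)."""
    return struct.pack("<BIdd", 1, 1, x, y)


def create_layer(path):
    """Create a hypothetical single-file layer: one table plus an attribute index."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS features ("
        " fid INTEGER PRIMARY KEY,"
        " name TEXT,"
        " population INTEGER,"
        " geom BLOB)"
    )
    # An index lets attribute lookups avoid the sequential scan a DBF needs.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_features_name ON features(name)")
    conn.commit()
    return conn


def add_features(conn, rows):
    """Insert all rows in one transaction; roll everything back on any error."""
    with conn:  # commits on success, rolls back if an exception escapes
        for name, population, x, y in rows:
            if population < 0:
                raise ValueError("negative population for %r" % (name,))
            conn.execute(
                "INSERT INTO features (name, population, geom) VALUES (?, ?, ?)",
                (name, population, point_wkb(x, y)),
            )


def find_by_name(conn, name):
    """Look up a feature by attribute value and decode its point geometry."""
    row = conn.execute(
        "SELECT fid, population, geom FROM features WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    fid, population, geom = row
    _byte_order, _geom_type, x, y = struct.unpack("<BIdd", geom)
    return fid, population, (x, y)
```

Calling `create_layer("example.osh")` would produce one ordinary SQLite file on disk (the `.osh` extension is just the name floated in the thread), and a failed `add_features` call leaves the file exactly as it was before the call, which is the transactional behaviour the proposal is after. Real geometries, spatial indexing, and coordinate system metadata are deliberately left out of this sketch.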
David

On Nov 13, 2007 12:55 PM, P Kishor <[EMAIL PROTECTED]> wrote:
> David,
>
> On 11/13/07, David William Bitner <[EMAIL PROTECTED]> wrote:
> > Part of the mission of the OSGeo Geodata committee
> > (http://www.osgeo.org/geodata) is to "promote the use of open
> > geospatial formats". If there is a group that wants to continue
> > pursuing the creation of a new open geodata format, I would like to
> > encourage the use of the geodata mailing list. That being said, I
> > think part of the discussion that needs to be had is whether or not
> > OSGeo should be creating standards in the first place.
> >
> > A couple comments that I have on some of the discussion that has
> > taken place in this thread:
> >
> > Regarding the suggestion that MapServer takes on this new format as
> > the primary format: I think this is way beyond the scope of what
> > OSGeo should be doing. Even if we spec a new standard, we (OSGeo)
> > have no teeth to be able to make any of our projects do any kind of
> > implementation of that standard. The choice of formats that are used
> > by any of our projects is driven by the needs of the users and
> > developers and the resources (time, money) that have been dedicated
> > towards implementing them. If someone takes OpenShape or whatever
> > and decides they have a business need that they can spend the time
> > or money to get it implemented then it will be implemented.
> > Shapefile has been and will continue to be an important format for
> > many projects, as it is one of the most widely distributed formats
> > in the GIS world, if not the most.
>
> I respectfully disagree. I think OSGeo has plenty of teeth for those
> who want to believe in it. In the end, yes, just like any real
> project, it needs a core of committed developers and plenty of time
> (or money -- usually they are synonymous). This is not something that
> can happen overnight, but if good, it deserves a start and support.
> That the long, long-term effects of a solid, relational,
> transactional, geodata format would be very good is a reasonable
> assumption for me.
>
> > Regarding the comments on standards wanking: Standards can get in
> > the way of progress along a straight line, but they can also
> > encourage interoperability that can create better progress for
> > everyone. To get a singular task done, standards often can slow
> > things down, but there *are* gains to be had from playing well with
> > everyone else.
>
> Here I totally agree. I am not sure how to interpret the "standards
> wanking" statement. On the one hand it is a reasonably accurate
> assessment of a lot of public hand-wringing and open alliances (for a
> really funny take on this, read Fake Steve's tirade on the open
> handset alliance at
> <http://fakesteve.blogspot.com/2007/11/its-not-phone-its-alliance.html>).
> But, on the other hand, it is a pretty damning judgment on any attempt
> to do things via collaboration, and thus, on OSGeo and such efforts
> themselves.
>
> My take is that if I can't do it alone, I will lay it out in the open
> hoping someone better than me will work on it as well. If I can do it
> alone, I will do it until I think it is ready to benefit from extra
> eyeballs. Sometimes getting started is the biggest hurdle.
>
> > David Bitner
> > OSGeo, Public Geospatial Data Project Chair
> >
> > On Nov 13, 2007 11:40 AM, Allan Doyle <[EMAIL PROTECTED]> wrote:
> > >
> > > On Nov 13, 2007, at 12:24 , Steve Coast wrote:
> > >
> > > > OSM: $0
> > > > CCBYSA: $0
> > > > Donation of entire Netherlands: Priceless
> > > >
> > > > Real artists ship. For everyone else there's standards wanking.
> > >
> > > Perhaps there's an art to wanking standards as well.
> > >
> > > > Seriously though, this is so kafka-esque. When OSM started it
> > > > was like this: We should have got a committee to design a
> > > > standard, then we could think about a committee to design an
> > > > ontology... and choose a name... and on some sunny distant day
> > > > make a map.
> > > >
> > > > On 13 Nov 2007, at 17:09, P Kishor wrote:
> > > >
> > > >> On 11/13/07, Landon Blake <[EMAIL PROTECTED]> wrote:
> > > >>> Puneet,
> > > >>>
> > > >>> You wrote: "Should be easy to transition to. By building the
> > > >>> new format on the structure of the Shapefile format, and *in
> > > >>> fact*, calling it "open shapefiles" or some such thing, we
> > > >>> indicate from its name that the transition is not that
> > > >>> revolutionary but is evolutionary. This, hopefully, will
> > > >>> bring some name-familiarity, and make the transition less
> > > >>> scary."
> > > >>>
> > > >>> I really think you are going to run into problems using
> > > >>> "Shapefile" as part of the trademark or name for any product
> > > >>> not sold by ESRI. I strongly recommend against this move. Let
> > > >>> people adopt the implementation of your idea for its merits,
> > > >>> not for name recognition that comes from another product
> > > >>> line.
> > > >>
> > > >> Good enough point to keep in mind, but not to get hung up over
> > > >> enough to entangle us. Suggestions for names of the data format
> > > >> can be a project in itself. "open spatial data format" or its
> > > >> variations could be chosen. Still, point taken.
> > > >>
> > > >>> You wrote: "ANSI standard C is still that magic common
> > > >>> denominator that compiles and works predictably on the
> > > >>> largest number of systems. I have a lot against Java, but
> > > >>> those who love Java should definitely work on tools for
> > > >>> accessing and working with this new format as it would only
> > > >>> make the format more widely used and adopted."
> > > >>>
> > > >>> It sounds to me like you are really describing a tool. File
> > > >>> formats are written in a binary encoding or text, not in a
> > > >>> programming language. If you are designing a tool you can
> > > >>> choose the programming language of your choice, but be aware
> > > >>> that this will limit the developers that adopt the tool. This
> > > >>> will be the case no matter what language you choose to use,
> > > >>> whether it is C, Java, or something else.
> > > >>>
> > > >>> If, in contrast, you are creating a file format, then
> > > >>> programming languages shouldn't really matter. Binary and
> > > >>> text data can be accessed by almost all programming
> > > >>> languages.
> > > >>>
> > > >>> I think you need to decide if you want a tool or a data
> > > >>> format. It sounds like you are shooting more for a spatial
> > > >>> database written in the C programming language that uses some
> > > >>> form of the ESRI Shapefile as its underlying data storage
> > > >>> mechanism. To me that is a tool or piece of software, not a
> > > >>> format. But maybe I don't completely understand your goal.
> > > >>
> > > >> Well, I am, frankly, confused.
> > > >>
> > > >> I was quite convinced I wasn't describing a "tool" but was
> > > >> describing a "format." Of course, to describe the format, I
> > > >> positioned it on the "format" (the SQLite-compatible format)
> > > >> used and popularized by a "tool" (SQLite, the library, which
> > > >> happens to be written in C). In my mind, having the data format
> > > >> based on SQLite *format* for its relational attribute handling
> > > >> was the real winner. In that sense, perhaps I conflated the
> > > >> format and the tool. I am not well versed in these things so I
> > > >> am probably already walking on thin ice, but that shouldn't
> > > >> stop others.
> > > >>
> > > >> So, forget that I mentioned C and Java... let's just
> > > >> concentrate on a way of laying out data on a disk that is not
> > > >> too dissimilar from how Shapefile data are laid out, except
> > > >> that we utilize the SQLite-compatible binary format for
> > > >> relational data handling, so that SQLite-enabled spatial tools
> > > >> can access this new format.
> > > >>
> > > >> And, put this format into public domain.
> > > >>
> > > >>> -----Original Message-----
> > > >>> From: [EMAIL PROTECTED]
> > > >>> [mailto:[EMAIL PROTECTED] On Behalf Of P Kishor
> > > >>> Sent: Tuesday, November 13, 2007 8:35 AM
> > > >>> To: OSGeo Discussions
> > > >>> Subject: [OSGeo-Discuss] Re: idea for an OSGeo project -- a new,open data format
> > > >>>
> > > >>> Thanks everyone, for responding. Here is my "groundwork."
> > > >>>
> > > >>> The new format --
> > > >>>
> > > >>> - Should be fast. SQLite is plenty fast, and anything that
> > > >>> simply "extends" the Shapefile format to inject relational
> > > >>> capabilities should be pretty fast. It should definitely be
> > > >>> faster than a geodatabase format (such as PostGIS/ArcSDE) and
> > > >>> perhaps even faster than Shapefiles especially while
> > > >>> accessing attribute data. DBF is sequential, and searching
> > > >>> for textual information is particularly expensive. SQLite has
> > > >>> been tuned to excellence. I have been working with it for a
> > > >>> few years now, and it really is an amazing product,
> > > >>> development community, support, and capabilities. That it is
> > > >>> in public domain makes for a transfat-free icing on the cake.
> > > >>>
> > > >>> - Should be unencumbered by licenses and copyrights. Ideally,
> > > >>> the new format could also be put back into public domain. We
> > > >>> want to remove all encumbrances to encourage rapid and wide
> > > >>> adoption.
> > > >>>
> > > >>> - Should be a single file. Well, some like multiple files and
> > > >>> some like single files. We can achieve both objectives by
> > > >>> using a tar-gzipped packaging such as Apple tends to use for
> > > >>> much of its stuff (for example, its Pages wordprocessor uses
> > > >>> a tgzipped xml file along with other resources for icons and
> > > >>> pictures and stuff). Or, if speed is going to be affected
> > > >>> because of gzipping and gunzipping, just a package format (I
> > > >>> have no idea if this is a Unix thing or a Mac OS thing -- we,
> > > >>> in the Mac world, call them packages... they appear like
> > > >>> files in the Finder, and like directories in the shell).
> > > >>>
> > > >>> - Should be easy to transition to. By building the new format
> > > >>> on the structure of the Shapefile format, and *in fact*,
> > > >>> calling it "open shapefiles" or some such thing, we indicate
> > > >>> from its name that the transition is not that revolutionary
> > > >>> but is evolutionary. This, hopefully, will bring some
> > > >>> name-familiarity, and make the transition less scary.
> > > >>>
> > > >>> - Frank mentions SQLite's lack of datatypes as an issue -- I
> > > >>> guess that is a matter of preference. I personally quite like
> > > >>> that freedom as it gives me, the application developer,
> > > >>> complete control over what goes where. SQLite actually does
> > > >>> now have a few datatypes that it respects, but doesn't
> > > >>> complain about. Since all users will be accessing the data
> > > >>> via an application, as long as the application is well
> > > >>> defined, it should be fine.
> > > >>>
> > > >>> - SQLite excels at one thing that it has been entrusted to do
> > > >>> -- retrieve data that it has been entrusted with at extremely
> > > >>> fast speeds, and maintain ACID data integrity in case of a
> > > >>> programmatic catastrophe. The transactions themselves are
> > > >>> worth their price of admission, which, happily, happens to be
> > > >>> zero.
> > > >>>
> > > >>> - Landon mentions Java support -- well, yes, use/work on
> > > >>> SQLite JDBC. I have been using it for a few days now and find
> > > >>> it to be a pretty competent conduit. Extend it, spatialize
> > > >>> it. ANSI standard C is still that magic common denominator
> > > >>> that compiles and works predictably on the largest number of
> > > >>> systems. I have a lot against Java, but those who love Java
> > > >>> should definitely work on tools for accessing and working
> > > >>> with this new format as it would only make the format more
> > > >>> widely used and adopted.
> > > >>>
> > > >>> Ok, enough for now.
> > > >>>
> > > >>> On Nov 13, 2007 8:52 AM, P Kishor <[EMAIL PROTECTED]> wrote:
> > > >>>> So, I am thinking, Shapefile is the de facto data standard
> > > >>>> for GIS data. Because it is open (albeit not Free), and
> > > >>>> given the deep and wide presence of ESRI's products from the
> > > >>>> beginning of the epoch, it has been widely adopted.
> > > >>>> Existence of shapelib, various language bindings, and ready
> > > >>>> use by products such as MapServer has continued to cement
> > > >>>> Shapefile as the format to use. All this is in spite of
> > > >>>> Shapefile's inherent drawbacks, particularly in the area of
> > > >>>> attribute data management.
> > > >>>>
> > > >>>> What if we came up with a new and improved data format --
> > > >>>> call it "Open Shapefile" (extension .osh) -- that would be
> > > >>>> completely Free, single-file based (instead of the multiple
> > > >>>> .shp, .dbf, .shx, etc.), and based on SQLite, giving the
> > > >>>> .osh format complete relational data handling capabilities.
> > > >>>> We would need a new version of Shapelib and improved
> > > >>>> language bindings, would make it the default and preferred
> > > >>>> format for MapServer, and would provide seamless and
> > > >>>> painless import of regular .shp data into .osh for native
> > > >>>> rendering. Its adoption would be quick in the open source
> > > >>>> community. The non-opensource community would not give a
> > > >>>> rat's behind for it, but it wouldn't affect them... they
> > > >>>> would still work with their preferred .shp until they
> > > >>>> learned better. By having a completely open and Free
> > > >>>> single-file based, built on SQLite, fully relational dbms
> > > >>>> capable spatial data format, it would be positioned for
> > > >>>> continued improvement and development.
> > > >>>>
> > > >>>> Is this too crazy?
> > > >>>>
> > > >>>> --
> > > >>>> Puneet Kishor
> > > >>>>
> > > >>> _______________________________________________
> > > >>> Discuss mailing list
> > > >>> [email protected]
> > > >>> http://lists.osgeo.org/mailman/listinfo/discuss
> > > >>>
> > > >>> Warning:
> > > >>> Information provided via electronic media is not guaranteed
> > > >>> against defects including translation and transmission
> > > >>> errors. If the reader is not the intended recipient, you are
> > > >>> hereby notified that any dissemination, distribution or
> > > >>> copying of this communication is strictly prohibited. If you
> > > >>> have received this information in error, please notify the
> > > >>> sender immediately.
> > > >
> > > > have fun,
> > > >
> > > > SteveC | [EMAIL PROTECTED] | http://www.asklater.com/steve/
> > >
> > > --
> > > Allan Doyle
> > > Director of Technology
> > > MIT Museum
> > > +1.617.452.2111
> >
> > --
> > ************************************
> > David William Bitner

--
************************************
David William Bitner
