Re: accumulo geo

mvangeertruy Fri, 28 Oct 2011 14:53:55 -0700


At Karaf we set up all of the contribs as their own sub-projects. We release
them separately and ensure that the core Karaf project doesn't depend on the
contribs. This way, if a contrib falls behind, we don't have any dependencies
on it and can easily remove it. I would suggest creating an accumulo-geo
sub-project for this reason.




Mike Van 

ASF Committer 



----- Original Message -----


From: "Todd Lipcon" <[email protected]> 
To: [email protected] 
Cc: [email protected] 
Sent: Friday, October 28, 2011 5:06:48 PM 
Subject: Re: accumulo geo 

Hey Billie, 

A word of warning on contribs: one thing to be wary of is the "drive 
by contribution". We found in Hadoop and HBase that many contribs were 
added to Hadoop as part of a research project or other "passing 
interest", and then not maintained. Since the core committers had very 
little knowledge of the contrib components, and the authors were no 
longer actively maintaining them, they ended up as rotting appendages 
to our codebase. Users would run into issues and then we'd be unable 
to help them work through them - not good for anyone. 

In HBase, we ended up ejecting our contribs to github. This worked out 
well - some have done OK, others have died off. But the ones that died 
off had no maintainers anyway - so better to let them die on their own 
than drag them forward unmaintained in SVN. We've always had the 
stance that, if an HBase-related project on github or elsewhere wants 
to enter contrib, then they can do so provided they have active 
maintainers who are truly committed to long term maintenance. For 
example, our REST server module graduated from contrib into a core 
part of our project, since its maintainers are also HBase committers 
who run the stuff in production. 

Not sure if this is "Apache-like" -- just my opinion as another developer. 

-Todd 

On Fri, Oct 28, 2011 at 1:52 PM, Billie J Rinaldi 
<[email protected]> wrote: 
> Anthony, 
> 
> It sounds interesting.  I have been thinking about how to start fostering a 
> set of contrib projects for Accumulo, but am unsure how we would manage such 
> things effectively (e.g. how do we make sure they work?  are they versioned 
> and released with Accumulo?).  Perhaps we could begin to work this out with 
> your project. 
> 
> Billie 
> 
> 
> ----- Original Message ----- 
>> From: "Anthony Fox" <[email protected]> 
>> To: "Accumulo dev" <[email protected]> 
>> Sent: Wednesday, October 26, 2011 4:30:40 PM 
>> Subject: accumulo geo 
>> All, 
>> 
>> I would like to gauge the interest in an extension to Accumulo to enable
>> geospatial capabilities. Currently, I have developed a schema for storing
>> raster data as tiles in Accumulo and a plugin to Geoserver that allows
>> Accumulo tables that use the specified schema to be exposed as WMS layers
>> for importing into a GIS. This is a natural fit for Accumulo since the
>> individual tiles are not large but the aggregate set of tiles that make up
>> a single layer can become very large. Accumulo packages those tiles into
>> blocks and distributes them around the cloud for quick access and redundant
>> storage. The implementation is in an early state.
>> 
>> I am currently investigating the feasibility of implementing an API for
>> storing, querying, and processing vector data in Accumulo. I would like
>> the API to be able to answer nearest neighbor queries, perform on-the-fly
>> reprojections for queries that come in in a particular projection, various
>> standard geospatial transformations such as buffering and finding
>> intersections, etc. My current thought is that the approach would be
>> similar to how PostGIS extends Postgres in that it dictates a schema and
>> storage format and then provides a user level api (a bunch of sql
>> functions) for processing that data. PostGIS also provides an r-tree index
>> implemented on top of GiST to enable geospatial querying. This type of
>> functionality is also a natural fit for Accumulo as r-tree minimum bounding
>> rectangles can map to tablet extents. However, this change would require
>> modifications to core functionality. Some mechanism for hooking in
>> alternative 'extents' may be a technique for dealing with this kind of
>> indexing scheme.
>> 
>> Is there any interest in these kinds of geospatial processing capabilities
>> in the Accumulo community and has anyone thought about/implemented some
>> geospatial functions?
>> 
>> Thanks, 
>> Anthony 
> 



-- 
Todd Lipcon 
Software Engineer, Cloudera
