I'm pleased to announce that Project Lizard will be going from bench
prototype to industrial prototype, starting today.
This is thanks to funding from the Technology Strategy Board (of the UK
Government) within the Open Data Tools Initiative. The output is
required to be open source.
Starting with the prototype, there are four areas to progress:
* Query Processing
* Cluster Communications and Deployment
* Bulk Loading
* Operational Support and Monitoring
Development at this stage will be on GitHub.
Contributions to Lizard will be dealt with in a way that is compatible
with a possible future Apache migration.
If you are interested, either as an advanced user with a large database
or in the technology itself, drop me a note.
Andy
On 01/09/13 19:20, Andy Seaborne wrote:
"Lizard" is a clustered SPARQL system - it's been my August project. The
target is providing fault-tolerant operation; if any one machine fails
(e.g. hardware or software problem), then Lizard is able to continue
providing the SPARQL service.
Lizard assumes machines fail in a "fail-stop" manner -- the machine fails
and stops on an error, and does not generate malicious information, nor
attempt to come alive again without operations intervention. No
Byzantine fault tolerance.
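
As an illustration only (this is not Lizard code), the fail-stop
assumption can be sketched in a few lines of Python. The names Replica
and Cluster are hypothetical, and the sketch assumes every replica holds
a full copy of the data, which may not match how Lizard actually
distributes its indexes.

```python
class Replica:
    """Hypothetical stand-in for one machine holding a full copy of the data."""

    def __init__(self, name, data):
        self.name = name
        self.data = data
        self.up = True  # once set to False, only an operator would reset it

    def query(self, key):
        if not self.up:
            # Fail-stop: a failed machine gives no answer, never a wrong one.
            raise ConnectionError(f"{self.name} is down")
        return self.data.get(key)


class Cluster:
    """Hypothetical front end that answers from the first live replica."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def query(self, key):
        for replica in self.replicas:
            try:
                return replica.query(key)
            except ConnectionError:
                continue  # skip the failed machine and try the next one
        raise RuntimeError("no replica available")
```

Because a failed machine is assumed to stay silent rather than return
bad data, any answer from a live replica can be used directly, with no
voting between replicas. Handling Byzantine faults would need that
extra machinery.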
The goal of work in August has been to prototype the overall design. By
putting something together that is potentially the right design, I can
gather experience of using it to learn what does, and does not, work.
* The code is messy, insufficiently tested and inefficient
Some components are written for convenience of
debugging and fast implementation rather than
efficiency.
* Configuration is very hard (magic required)
It is based on TDB and the rest of Jena, but changes the index design to
make it more suitable for clustering.
There is a related new query engine called 'quack'. It uses merge and
hash joins because the indexed join style of single-machine TDB does not
work on a cluster.
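
To make the two join styles concrete, here is a minimal Python sketch.
It is not quack's code: the function names hash_join and merge_join, and
the representation of solution bindings as dicts, are assumptions made
for illustration. Both functions consume whole streams of bindings
rather than probing an index once per row, which is presumably what
makes them a better fit when the index lives on another machine.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Join two lists of bindings (dicts) on one shared variable."""
    # Build phase: index the left side by the join variable.
    table = defaultdict(list)
    for row in left:
        table[row[key]].append(row)
    # Probe phase: look up each right-hand row in the in-memory table.
    out = []
    for row in right:
        for match in table.get(row[key], []):
            out.append({**match, **row})
    return out


def merge_join(left, right, key):
    """Join two lists of bindings that are both already sorted on `key`."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side, then emit every pairing.
            i_end = i
            while i_end < len(left) and left[i_end][key] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    out.append({**l_row, **r_row})
            i, j = i_end, j_end
    return out
```

The merge join only works if both inputs arrive sorted on the join
variable; the hash join has no ordering requirement but must hold one
side in memory.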
https://github.com/afs/proto-lizard
It is not ready for real use.
Andy