Hi there,
Following the discussion thread, here is the formal vote on the Marmotta
proposal:
Please cast your votes on whether to accept the Apache Marmotta proposal:
[ ] +1 Accept Marmotta into the Apache Incubator
[ ] +0 Indifferent to the acceptance of Marmotta
[ ] -1 Do not accept the Marmotta proposal because ...
The vote will be open until at least 23:59 Sunday 2nd December UTC
(which is three full days from midnight tonight)
Andy
http://wiki.apache.org/incubator/MarmottaProposal
-----------------------
== Abstract
Marmotta is a Linked Data platform for industry-strength installations.
== Proposal
The goal of Apache Marmotta is to provide an open implementation of a
Linked Data Platform that can be used, extended, and deployed easily by
organizations who want to publish Linked Data or build custom
applications on Linked Data.
The phrase "Linked Data" is used here idiosyncratically to refer to a
data integration paradigm across the Web. The term was coined by Tim
Berners-Lee in 2006, and it is based on four very simple principles
which basically describe recommended best practices for exposing,
sharing, and connecting pieces of data, information, and knowledge on
the Semantic Web using URIs and the RDF technology stack. Therefore
Linked Data is about using the Web to connect related data that wasn't
previously linked, or using the Web to lower the barriers to linking
data currently linked using other methods.
Marmotta will follow the core recommendations of the W3C on RDF, SPARQL
and Linked Data publishing, particularly the emerging Linked Data
Platform (LDP) recommendation. It will also offer extensions for
frequently needed additional functionalities like Linked Data Querying,
WebID, WebACL, Reasoning, and Versioning. Marmotta aims to cover both
Linked Open Data and Enterprise Linked Data scenarios, providing
facilities to deal with different data sources and requirements (small
data/big data, open access/restricted access, etc.).
== Background
The Semantic Web isn't just about putting data on the web. It is about
making links, so that a person or machine can explore the web of data.
Moreover, the Web has quickly evolved to a Read-Write paradigm, and
Linked Data technologies have followed. Marmotta will address this
challenge and offer a common infrastructure for organizations working in
this area.
Marmotta comes as a continuation of the work in the Linked Media
Framework (aka LMF) project. LMF is an easy-to-setup server application
that bundles central Semantic Web technologies to offer some advanced
services. The Linked Media Framework consists of LMF Core which provides
a Read-Write Linked Data server, plus some modules that complement the
server with additional capabilities, such as SPARQL 1.1, LDPath,
LDCache, Reasoning, Versioning, etc. LMF also provides a Client
Library, currently available in Java, PHP, and JavaScript, as a
convenient API abstraction around the LMF web services. Currently LMF
integrates with other relevant tools (Apache Stanbol, Google Refine or
Drupal) to cover a wider range of use cases and needs.
== Rationale
Linked Data technologies are now at a turning point from mostly research
projects to industrial applications, and a lot of standardisation is
currently in progress. Industrial applications require a reliable and
scalable infrastructure that follows and helps define a standard way
of publishing and consuming Linked Data on the Web. The proposers have a
strong background in building such applications and have invested
considerable effort in recent years in building up an initial version
of such a platform (the “Linked Media Framework” or “LMF”). Starting
from this solid base, we strongly believe that Apache is the right
environment to open the development of this project to a wider scope.
Marmotta has the potential of being a reference implementation and
Apache provides a better environment for a collaborative development
effort. With its well-established governance model based on meritocracy
and its handling of IP/legal issues, people from different organizations can
more easily contribute to the project. This will help unify the efforts
of people implementing the Linked Data Platform specification and other
Semantic Web standards. In addition, it would considerably help
organizations in adopting Linked Data technologies and would provide a
solid base for further research activities in the community.
== Initial Goals
* Foster the use of Semantic Web Technologies in industry
* Provide an open source and community-driven implementation of a Linked
Data Platform and related Semantic Web standards, LDP 1.0 Draft and
SPARQL 1.1 mainly
* Move the existing LMF source from the current Google Code page to the
Apache infrastructure
* Remove LMF extensions that are not relevant for a core Linked Data
platform (e.g. semantic search and content enhancement)
* Define a pluggable architecture for providing a data governance
framework for enterprise legacy sources
* Revise the architecture, moving to a non-proprietary RDF API (Sesame
or Jena) and deciding whether to move to OSGi/Felix or stay with
CDI/JavaEE as SOA framework
* Identify and replace dependencies with an incompatible license (e.g.
replace XOM with JDOM)
== Current Status
The source for the current LMF is a stable software artifact that,
having emerged from research circles, already has a relevant number of
real-world installations, e.g. Red Bull Media House, Salzburger
Nachrichten, derStandard.at, etc.
== Meritocracy
LMF is the outcome of a number of research projects coordinated by, or
with the participation of, Salzburg Research during the last five
years. The original developers are still part of the core development
team, while at the same time many new committers have joined the team.
By taking this step, we have made it clear to our community that going
forward, the community, rather than a single organization, will
determine the future of Marmotta.
Meritocracy is inherent in the research community we come from, and
since Apache Marmotta aims to be a unifying project for this community
it is only natural to continue this approach.
== Community
Marmotta addresses two target communities: On the one hand,
researchers/developers who are working with Semantic Web technologies.
On the other hand, companies or organizations that require Semantic Web
infrastructure. The initial committers are active participants in both
communities.
== Core Developers
Sebastian Schaffert (sebastian dot schaffert at salzburgresearch dot at)
Thomas Kurz (thomas dot kurz at salzburgresearch dot at)
Jakob Frank (jakob dot frank at salzburgresearch dot at)
Dietmar Glachs (dietmar dot glachs at salzburgresearch dot at)
Sergio Fernández (sergio dot fernandez at salzburgresearch dot at)
== Alignment
Marmotta complements and integrates well with the current landscape of
Apache projects, especially with the emerging “semantic technologies”
cluster within the ASF. Concretely, Marmotta will align with the
following projects:
* Apache Commons (lang, logging, http and so on) is extensively used in
many parts of the project
* Apache Tomcat is currently the primary platform for deployment; with
Marmotta, Tomcat can be turned into a Linked Data server
* Apache Stanbol will very likely adopt parts of the Marmotta
infrastructure, particularly for implementing the entity hub and for
exposing the RDF data as Linked Data
* Apache Jena could become the RDF API used throughout Marmotta; an
architectural decision is yet to be taken
* Apache Any23 could be integrated in the LMF as a wrapper around non-RDF
data sources to consume them as Linked Data; a similar approach has
already been taken by the LMF
* Apache Tika could be used for metadata extraction of content
* Apache Karaf and Apache Felix could become the OSGi container for
running and configuring the Marmotta components
In addition to these more-or-less concrete proposals, there are some
options that still require some strategic decisions. For example, it
may make sense to build a storage backend based on Apache Hadoop for
large-scale installations using HBase (e.g. jena grande, h2rdf, hdrs,
hadoop rdf). Several extensions also build on existing Apache projects,
most importantly the LMF Semantic Search component, which offers
semantic search over Linked Data resources.
== Known Risks
Probably one of the major risks is not being able to engage the community
in addressing the new challenges. Knowing this, we will do our best to
provide the best possible facilities to attract new developers and
organizations. In particular, we will try to actively engage developers
from the Linked Data community through our networks.
== Orphaned Products
The current project is part of the business portfolio and a strategic
project of the contributor organization, and will continue in that way.
So there is no risk of any of the usual warning signs of orphaned or
abandoned code.
== Inexperience with Open Source
The committers have extensive experience with open source development and
communities. Several of the key committers have been actively involved
in Open Source projects for 10-15 years or more. The initial code base
of Marmotta has already been developed as an Open Source project over the
last 5 years.
== Homogenous Developers
We are aware that the initial list of committers is not ideal in the
long term, so there is a strong commitment to open up the project and
create a much more diverse development team. Part of the reason to
enter the Apache incubation process is to open up the development to
more interested participants.
== Reliance on Salaried Developers
Right now most or all of the work is salaried, but the developers
identify strongly with the project. When opening up the
development using Apache as a platform, we expect that the future
development will occur on both salaried and volunteer time, particularly
by participants from the Linked Data community.
== Relationships with Other Apache Projects
Although current RDF/SPARQL support in LMF is built on top of the OpenRDF
Sesame API, Marmotta is closely related to many Apache projects, such as
Stanbol, Jena and Any23. See “Alignment” above.
== An Excessive Fascination with the Apache Brand
While we expect the Apache brand may help attract more contributors, our
interest in starting this project is based on the factors mentioned in
the Rationale section.
== Documentation
Documentation for the current project can be found at:
http://lmf.googlecode.com
http://doc.lmf.googlecode.com/hg/api/index.html
http://doc.lmf.googlecode.com/hg/rest/index.html
http://doc.lmf.googlecode.com/hg/client/index.html
== Initial Source
LMF (formerly KiWi) has been developed since 2008. It is important to
note that LMF will not be contributed to Marmotta as a whole, but
only those parts that make up the "Linked Data Platform" functionality
(Linked Data Server, RDF Store, SPARQL, LDCache, Versioning, Reasoner
and LDPath). The idea is to focus Marmotta much more on the core needs,
keeping all surrounding functionalities (Media-related modules and
Semantic Search, basically) out of the initial scope. The community,
however, will ultimately decide which modules are relevant.
Since LMF is a very modular software artifact, it should be fairly easy
to partition it in this way to kick off Marmotta.
The current source code can be found at Google Code:
http://lmf.googlecode.com
== Source and Intellectual Property Submission Plan
Salzburg Research Forschungsgesellschaft mbH is the sole copyright owner
of the initial code to be contributed, so there should not be any problem
with the standard IP clearance process. The current license is already
the Apache License 2.0.
== External Dependencies
Most of the current dependencies should have Apache-compatible licenses,
including BSD, CDDL, CPL, MPL and MIT licensed dependencies. We are
aware of some incompatible licenses right now, but we will work to solve
this issue. See Appendix A for a detailed list of dependencies.
== Cryptography
Does Not Apply.
== Required Resources
Mailing lists
marmotta-dev
marmotta-commits
marmotta-users
Repository
git://git.apache.org/marmotta.git
Issue Tracking
Jira: MARMOTTA (Kanban board enabled at GreenHopper)
Other Resources
Jenkins/Hudson for builds and test running.
Wiki for internal documentation purposes
Blog to improve the project dissemination
== Initial Committers
Sebastian Schaffert
(sebastian dot schaffert at salzburgresearch dot at)
Thomas Kurz
(thomas dot kurz at salzburgresearch dot at)
Jakob Frank
(jakob dot frank at salzburgresearch dot at)
Dietmar Glachs
(dietmar dot glachs at salzburgresearch dot at)
Sergio Fernández
(sergio dot fernandez at salzburgresearch dot at)
Rupert Westenthaler
(rwesten at apache dot org)
== Affiliations
All initial committers are currently affiliated with Salzburg Research
Forschungsgesellschaft mbH.
== Sponsors
= Champion
Andy Seaborne (andy at apache dot org)
= Nominated Mentors
Fabian Christ (fchrist at apache dot org)
Nandana Mihindukulasooriya (nandana at apache dot org)
Andy Seaborne (andy at apache dot org)
= Sponsoring Entity
Apache Incubator PMC