Hello Fedora Community,
We have concluded the fourth sprint of the "Beta Phase" of Fedora 4
development. The work in this sprint, as in every sprint, was made possible
entirely by Fedora stakeholder institutions contributing developer time. If
you would like to become involved with Fedora 4 development, please send me
an email. If you have comments on the work from this sprint, please send an
email or comment directly on the wiki.

Below is a link to the Sprint B4 summary (also included in full in the body
of this email):
https://wiki.duraspace.org/display/FF/Sprint+B4+Summary

Andrew Woods


================
Development Team
================

Michael Durbin - University of Virginia
Greg Jansen - University of North Carolina, Chapel Hill
Esme Cowles - University of California, San Diego
Eric James - Yale University
Ye Cao - Max Planck Digital Library
Matthias Walter - Max Planck Digital Library
Markus Haarländer - Max Planck Digital Library

=============
Sprint Themes
=============

--------------------
1) Fedora 3 to 4 Upgrade

As a follow-on to the initial Fedora 3 to 4 tooling introduced in Sprint B3
[1], this sprint added several new repository migration features:

- Fedora 3 objects and datastreams are accessible through the Fedora 4
  connector
- An initial set of Fedora 3 object and datastream properties is exposed
  through the Fedora 4 connector

This work represents the first step in a repository migration, namely,
creating a Fedora 4 connector that is able to read from and expose the
content of a Fedora 3 repository. Upon the completion of this connector
effort, the second step will consist of enabling the migration of the
objects and (optionally) datastreams into the direct management of the
Fedora 4 repository.

Further details on the Fedora 3 to 4 Upgrade design [2] and concept
mapping [3] can be found on the wiki.

--------------------
2) Triplestore

Building on the foundation laid in Sprint B2 [4], this sprint cleaned up
both the implementation and the accompanying documentation, to encourage
the community to set up Fedora 4 with an external triplestore and provide
feedback. There are three components involved:

- A Fedora 4 repository (notes [5] to build from source)
- A triplestore (setup notes [6] for both Fuseki and Sesame)
- A stand-alone repository message listener / triplestore indexer (notes [7]
  to build from source)

To keep the triplestore implementation decoupled from the Fedora 4 core,
the established pattern is a configurable repository message-listener
component that transforms repository messages before feeding them to the
external triplestore index. Additional notes on how to perform actions on
the repository and subsequently query the triplestore to inspect the
results can be found on the wiki [8].
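
As a rough illustration of the listener pattern described above (not the
actual indexer code), the transformation step might look like the Python
sketch below. The event types, the function names, and the shape of the
input are assumptions made for this example and do not reflect the actual
Fedora 4 message format. The output is a SPARQL 1.1 Update string that a
triplestore such as Fuseki or Sesame could accept.

```python
from typing import Iterable, Tuple

# Hypothetical triple shape: (subject IRI, predicate IRI, object term).
Triple = Tuple[str, str, str]


def format_term(term: str) -> str:
    """Render an object term: IRIs in angle brackets, anything else as a literal."""
    if term.startswith(("http://", "https://")):
        return f"<{term}>"
    escaped = (
        term.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def to_sparql_update(event_type: str, subject: str, triples: Iterable[Triple]) -> str:
    """Turn one (hypothetical) repository event into a SPARQL 1.1 Update string."""
    # Always clear whatever the index currently holds for this subject.
    clear = f"DELETE WHERE {{ <{subject}> ?p ?o }}"
    if event_type == "removed":
        return clear
    # For any other event type, re-insert the current triples after clearing.
    body = " ".join(f"<{s}> <{p}> {format_term(o)} ." for s, p, o in triples)
    return f"{clear} ;\nINSERT DATA {{ {body} }}"
```

Clearing and re-inserting on every event keeps the sketch simple; a real
indexer may well apply finer-grained updates.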

--------------------
3) Search

Following the same pattern of decoupled integration employed in the
triplestore work, initial design work [9] on a search index began during
this sprint. The approach is to consume repository messages, then feed them
to an external search index. The basic flow was proven in this sprint. The
follow-on work is to implement the configurable transformation of Fedora 4
object properties into a document that can be fed to the search index.
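
To make the idea of a configurable transformation concrete, here is a
minimal Python sketch. It assumes, purely for illustration, that the
configuration is a mapping from property (predicate) IRIs to index field
names; the function name, the "id" field, and the input shape are all
hypothetical and are not taken from the design [9].

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple, Union

# Hypothetical input: (predicate IRI, value) pairs for one repository object.
Property = Tuple[str, str]
Document = Dict[str, Union[str, List[str]]]


def build_document(
    subject: str, properties: Iterable[Property], field_map: Dict[str, str]
) -> Document:
    """Map configured properties onto multi-valued index fields."""
    fields: Dict[str, List[str]] = defaultdict(list)
    for predicate, value in properties:
        field = field_map.get(predicate)
        if field is not None:  # properties without a mapping are skipped
            fields[field].append(value)
    return {"id": subject, **fields}
```

Changing which properties are indexed then only requires editing the
mapping, not the listener code.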

--------------------
4) Authentication and Authorization

This sprint advanced the authentication and authorization feature by
building out the integration test framework used to validate the current
and future access-control implementations.

--------------------
5) Large Files

In the course of investigating support for multi-gigabyte datastreams
during Sprint B2 [4], performance bottlenecks were discovered. This sprint
focused on addressing those bottlenecks. When projecting over large content
via the filesystem connector, one significant slowdown results from the
repeated recalculation of the content's checksum. Three solutions to this
issue were submitted to the underlying ModeShape open source codebase for
review and integration.
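
For readers unfamiliar with the problem, the Python sketch below shows one
generic way to avoid rehashing unchanged files: cache the digest and reuse
it while the file's size and modification time stay the same. This is an
illustration only and is not necessarily any of the three submitted
solutions; the class name, the choice of SHA-1, and the cache key are
assumptions.

```python
import hashlib
import os
from typing import Dict, Tuple


class ChecksumCache:
    """Cache file digests, keyed on path and validated by size and mtime."""

    def __init__(self) -> None:
        self._cache: Dict[str, Tuple[int, int, str]] = {}
        self.computations = 0  # number of times a file was actually hashed

    def sha1(self, path: str) -> str:
        stat = os.stat(path)
        cached = self._cache.get(path)
        if cached is not None and cached[:2] == (stat.st_size, stat.st_mtime_ns):
            return cached[2]  # file looks unchanged, so reuse the digest
        digest = hashlib.sha1()
        with open(path, "rb") as handle:
            # Read in 1 MiB chunks so large files are never held in memory.
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
        self.computations += 1
        hexdigest = digest.hexdigest()
        self._cache[path] = (stat.st_size, stat.st_mtime_ns, hexdigest)
        return hexdigest
```

The trade-off is that a change which preserves both size and modification
time would go unnoticed, so this suits read-mostly content.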

Performance numbers and further detail regarding the large file experiments
can be found on the wiki [10].

--------------------
6) Developer Capacity

In addition to the features extended in this sprint, one of the significant
outcomes was the expansion of development capacity within the Fedora
community. Three developers new to the Fedora 4 project worked on this
sprint, all from:

- Max Planck Digital Library

Also, Greg Jansen from UNC, Chapel Hill took on the role of running the
sprint ("scrum master"), convening the daily stand-up meetings, and
assisting with implementation roadblocks.

==========
References
==========

 [1] https://wiki.duraspace.org/display/FF/Sprint+B3+Summary
 [2] https://wiki.duraspace.org/display/FF/Design+-+Fedora+3+to+4+Upgrade
 [3] https://wiki.duraspace.org/display/FF/Fedora+3+Object+representation+in+Fedora+4
 [4] https://wiki.duraspace.org/display/FF/Sprint+B2+Summary
 [5] https://github.com/futures/fcrepo4/blob/master/README.md
 [6] https://wiki.duraspace.org/display/FF/Triplestore+Setup
 [7] https://github.com/futures/fcrepo-jms-indexer-pluggable/blob/master/README.md
 [8] https://wiki.duraspace.org/display/FF/SPARQL+Recipes
 [9] https://wiki.duraspace.org/display/FF/Design+-+Customizable+Search+Index
[10] https://wiki.duraspace.org/display/FF/Design+-+Large+Files