pkeane wrote:
I wonder if I might get feedback on this question:
I am in the process of refactoring a web-based data repository (see
daseproject.org), moving from PHP4 to PHP5 among other things, but also
moving to a more fully realized service-based architecture. The
repository holds many collections of digital assets, primarily
images, but also audio, video, PDFs, etc.
The repository is used by faculty at UT Austin to
store/organize/share/present digital 'stuff' for both instructional and
research purposes. When a faculty member creates a new collection they
choose the set of attributes (i.e. "fields") they would like to use to
describe their items. Since these are often existing collections being
ported from FileMaker, Access, etc. it is quite efficient to be able to
use the same attributes. Thus a collection is simply a set of "items"
described by arbitrary key-value pairs with an associated media file (or
files if there are different sizes, say, for an image, or an mp3 AND wav
file for audio).
I understand the benefit of using Atom for outward-facing services,
APIs, etc., but does Atom offer benefits for internal messaging, data
transfer, AJAX, etc.? I suspect yes, but I cannot get past the fact that
my ultra-generic XML schema is REALLY easy to deal with:
[note that there are only TWO possible child elements for an item (think
atom 'entry'): metadata AND media_file]
<item serial_number="000435213">
<metadata
ascii_id="admin_checksum">630230b057c511cbee87447960fff02e</metadata>
<metadata ascii_id="admin_filename">62-GT-06.jpg</metadata>
<metadata ascii_id="admin_file_size">318642</metadata>
<metadata ascii_id="admin_image_height">576</metadata>
<metadata ascii_id="admin_image_width">720</metadata>
<metadata ascii_id="admin_mime_type">image/jpeg</metadata>
<metadata ascii_id="admin_serial_number">000435213</metadata>
<metadata ascii_id="admin_upload_date_time">2007-07-18T15:59:28</metadata>
<metadata ascii_id="credit">Photographer: Unknown</metadata>
<metadata ascii_id="dase_rights">Restricted</metadata>
<metadata ascii_id="description">photo of Ben Barnes while Speaker,
black and white</metadata>
<metadata ascii_id="keyword">Ben Barnes</metadata>
<metadata ascii_id="keyword">Capitol Building interior</metadata>
<metadata ascii_id="keyword">Lieutenant Governor</metadata>
<metadata ascii_id="keyword">Speaker of the House</metadata>
<metadata ascii_id="original_filename">62-GT-06.jpg</metadata>
<metadata ascii_id="rights_owner">Senate Media Services</metadata>
<metadata ascii_id="rights_status">Use in Texas Politics content</metadata>
<metadata ascii_id="scratch_pad">/62-GT-06.jpg</metadata>
<metadata ascii_id="title">Ben Barnes</metadata>
<metadata ascii_id="used_in_chapter">none</metadata>
<media_file filename="000435213_800.jpg" size="medium" height="576"
width="720" mime_type="image/jpeg" />
<media_file filename="000435213_100.jpg" size="thumbnail" height="80"
width="100" mime_type="image/jpeg" />
<media_file filename="000435213_640.jpg" size="small" height="480"
width="600" mime_type="image/jpeg" />
</item>
Any thoughts on the benefits of using Atom here? While I realize the
wisdom of NOT reinventing wheels
(http://www.tbray.org/ongoing/When/200x/2006/01/08/No-New-XML-Languages),
I also aim for maximum simplicity.
This looks like it will drop into Atom. Your main reason right now
would be to integrate with a globally available toolchain so you can
interchange information. Maximum simplicity means not defining yet another
format that you have to persuade others to support. For example, rights_owner
could be made to go away in favour of atom:rights, Dublin Core or CC.
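As a rough illustration of how the item might drop into Atom, here is a minimal Python sketch using only the standard library. The field mapping (title to atom:title, description to atom:summary, rights_owner to atom:rights, keyword to atom:category, media_file to an enclosure link) is one assumed possibility, not something DASE defines. The base URL and the id scheme are hypothetical, and required Atom elements such as atom:updated and atom:author are left out for brevity.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Trimmed copy of the item above; only a few of its fields are kept.
ITEM_XML = """<item serial_number="000435213">
<metadata ascii_id="title">Ben Barnes</metadata>
<metadata ascii_id="description">photo of Ben Barnes while Speaker, black and white</metadata>
<metadata ascii_id="rights_owner">Senate Media Services</metadata>
<metadata ascii_id="keyword">Ben Barnes</metadata>
<metadata ascii_id="keyword">Speaker of the House</metadata>
<media_file filename="000435213_800.jpg" size="medium" height="576" width="720" mime_type="image/jpeg" />
</item>"""

# Assumed mapping from ascii_id values to Atom text elements.
TEXT_FIELDS = {"title": "title", "description": "summary", "rights_owner": "rights"}


def atom(tag):
    """Return a tag name qualified with the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def item_to_atom_entry(item_xml, base_url):
    """Build an (incomplete) atom:entry element from one generic item."""
    item = ET.fromstring(item_xml)
    entry = ET.Element(atom("entry"))
    # Hypothetical id scheme: base URL plus the item's serial number.
    ET.SubElement(entry, atom("id")).text = base_url + item.get("serial_number")
    for meta in item.findall("metadata"):
        key = meta.get("ascii_id")
        if key in TEXT_FIELDS:
            ET.SubElement(entry, atom(TEXT_FIELDS[key])).text = meta.text
        elif key == "keyword":
            # Repeated keywords become repeated atom:category elements.
            ET.SubElement(entry, atom("category"), term=meta.text)
    for media in item.findall("media_file"):
        # Each media file becomes an enclosure link with its MIME type.
        ET.SubElement(entry, atom("link"), rel="enclosure",
                      type=media.get("mime_type"),
                      href=base_url + media.get("filename"))
    return entry


# http://example.org/media/ is a placeholder, not a real DASE URL.
entry = item_to_atom_entry(ITEM_XML, "http://example.org/media/")
ET.register_namespace("", ATOM_NS)
print(ET.tostring(entry, encoding="unicode"))
```

Fields with no obvious Atom equivalent (admin_checksum, scratch_pad and so on) are simply dropped here; in practice they could travel as extension elements in their own namespace.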
And I don't agree that your ultra-generic schema will be easy to deal
with; ultra-generic things tend to be more complicated than expected.
<metadata ascii_id="title"><h3>Ben & Jerry</h3></metadata>
what happens there?
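To make the question concrete, here is a small Python sketch (standard library only) of what one XML parser does with that line. It is an illustration of the ambiguity, not a statement about how DASE handles it: the bare ampersand makes the fragment ill-formed, and even once that is escaped, the h3 is read as a child element rather than as part of the title text.

```python
import xml.etree.ElementTree as ET

raw = '<metadata ascii_id="title"><h3>Ben & Jerry</h3></metadata>'

# 1. As written, the bare "&" is not well-formed XML, so parsing fails.
try:
    ET.fromstring(raw)
    parsed_ok = True
except ET.ParseError:
    parsed_ok = False

# 2. Escaping only the ampersand parses, but <h3> becomes a child element:
#    the metadata element itself has no text, and a consumer expecting a
#    plain string value has to decide what to do with nested markup.
elem = ET.fromstring(raw.replace("&", "&amp;"))
direct_text = elem.text          # None
child_tag = elem[0].tag          # "h3"
child_text = elem[0].text        # "Ben & Jerry"

# 3. Treating the whole value as text means escaping the markup too.
as_text = ET.Element("metadata", ascii_id="title")
as_text.text = "<h3>Ben & Jerry</h3>"
serialized = ET.tostring(as_text, encoding="unicode")
round_trip = ET.fromstring(serialized).text

print(parsed_ok, direct_text, child_tag, child_text)
print(serialized)
print(round_trip)
```

The generic schema has no way to say which of these readings is intended. Atom text constructs such as atom:title carry a type attribute ("text", "html" or "xhtml") for exactly this purpose.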
cheers
Bill