Yup, I am trying to decide if the tool support is enough to justify the effort. And still I wonder if there is some other potential side benefit I am not seeing. Here, by the way, is a collection as an Atom feed (with only one item shown). Note that collection owners can declare their own custom attributes to "map" to Atom elements if they wish, in which case they appear in the default Atom namespace; otherwise they are in the "dase" namespace (standard administrative metadata common to all collections) or in the collection's own namespace.

Note that there is no hand coding here, just a method on a collection object, e.g., "print $collection->asAtom()".

-pk

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
  xmlns:dase="http://quickdraw.laits.utexas.edu/dase"
  xmlns:texpol="http://quickdraw.laits.utexas.edu/dase/texpol_image_collection/1.0"
  xml:base="http://quickdraw.laits.utexas.edu/dase/texpol_image_collection/">
 <title>Texas Politics Image Collection</title>
 <id>http://quickdraw.laits.utexas.edu/dase/texpol_image_collection</id>
 <author>
  <name/>
 </author>
 <updated>1969-12-31T18:00:00-06:00</updated>
 <entry xml:base="http://quickdraw.laits.utexas.edu/dase/texpol_image_collection/">
  <id>http://quickdraw.laits.utexas.edu/dase/texpol_image_collection/000435205</id>
  <updated>1969-12-31T18:00:00-06:00</updated>
  <title>Congress Avenue</title>
  <summary>photo of Congress Avenue</summary>
  <dase:admin_mime_type>image/jpeg</dase:admin_mime_type>
  <dase:admin_filename>PICA17902.JPG</dase:admin_filename>
  <dase:admin_checksum>c837f0abd05c8b7126b8dac15d510f30</dase:admin_checksum>
  <dase:admin_file_size>787705</dase:admin_file_size>
  <dase:admin_image_width>1408</dase:admin_image_width>
  <dase:admin_upload_date_time>2007-07-18T15:59:19</dase:admin_upload_date_time>
  <dase:admin_serial_number>000435205</dase:admin_serial_number>
  <dase:admin_image_height>1209</dase:admin_image_height>
  <texpol:keyword>Congress Avenue</texpol:keyword>
  <texpol:keyword>buildings</texpol:keyword>
  <texpol:scratch_pad>/PICA17902.JPG</texpol:scratch_pad>
  <texpol:rights_owner>Austin History Center</texpol:rights_owner>
  <texpol:rights_status>Use in Texas Politics content</texpol:rights_status>
  <texpol:credit>Photographer: Unknown</texpol:credit>
  <texpol:dase_rights>Restricted</texpol:dase_rights>
  <texpol:original_filename>PICA17902.JPG</texpol:original_filename>
  <texpol:used_in_chapter>executive</texpol:used_in_chapter>
  <link length="9701" type="image/jpeg"
    rel="http://quickdraw.laits.utexas.edu/dase/media/thumbnail"
    href="/media/thumbnail/000435205_100.jpg"/>
  <link length="78505" type="image/jpeg"
    rel="http://quickdraw.laits.utexas.edu/dase/media/viewitem"
    href="/media/viewitem/000435205_400.jpg"/>
  <link length="783340" type="image/jpeg"
    rel="http://quickdraw.laits.utexas.edu/dase/media/full"
    href="/media/full/000435205_3600.jpg"/>
  <content
    src="http://quickdraw.laits.utexas.edu/dase/texpol_image_collection/media/thumbnail/000435205_100.jpg"
    type="image/jpeg"/>
 </entry>
 </feed>
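
For readers who want to see how attributes could end up in the three namespaces described above, here is a minimal sketch in Python. It is not the DASe asAtom() method (that code is not shown in this thread); the function name entry_as_atom, the ATOM_MAP mapping, and the rule "keys starting with admin_ go to the dase namespace" are all assumptions made for illustration, based only on the element names visible in the feed.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
DASE_NS = "http://quickdraw.laits.utexas.edu/dase"
TEXPOL_NS = "http://quickdraw.laits.utexas.edu/dase/texpol_image_collection/1.0"

# Hypothetical owner-declared mapping of collection attributes to Atom elements.
ATOM_MAP = {"title": "title", "description": "summary"}

# Serialize with the same prefixes the feed above uses.
ET.register_namespace("", ATOM_NS)
ET.register_namespace("dase", DASE_NS)
ET.register_namespace("texpol", TEXPOL_NS)


def qname(ns, local):
    """Return an ElementTree-style qualified name: {namespace}local."""
    return f"{{{ns}}}{local}"


def entry_as_atom(item_id, updated, metadata, collection_ns=TEXPOL_NS):
    """Build an atom:entry from a list of (ascii_id, value) pairs."""
    entry = ET.Element(qname(ATOM_NS, "entry"))
    ET.SubElement(entry, qname(ATOM_NS, "id")).text = item_id
    ET.SubElement(entry, qname(ATOM_NS, "updated")).text = updated
    for ascii_id, value in metadata:
        if ascii_id in ATOM_MAP:
            # Mapped attributes appear as plain Atom elements.
            tag = qname(ATOM_NS, ATOM_MAP[ascii_id])
        elif ascii_id.startswith("admin_"):
            # Assumption: administrative metadata is recognizable by prefix.
            tag = qname(DASE_NS, ascii_id)
        else:
            # Everything else stays in the collection's own namespace.
            tag = qname(collection_ns, ascii_id)
        ET.SubElement(entry, tag).text = value
    return entry


entry = entry_as_atom(
    "http://quickdraw.laits.utexas.edu/dase/texpol_image_collection/000435205",
    "1969-12-31T18:00:00-06:00",
    [
        ("title", "Congress Avenue"),
        ("description", "photo of Congress Avenue"),
        ("admin_mime_type", "image/jpeg"),
        ("keyword", "Congress Avenue"),
        ("keyword", "buildings"),
    ],
)
print(ET.tostring(entry, encoding="unicode"))
```

Repeated keys such as keyword simply become repeated elements, which matches the entry shown above.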


On Fri, 5 Oct 2007, James M Snell wrote:

Basically, if it's a closed system with specific clients, there likely
will not be any benefit to using Atom.  If you wish to enable
interchange and interop with other applications, there will be benefits
to using Atom, if only to leverage the existing tool support.

- James

pkeane wrote:

Yes, it is really nothing more than key-value pairs.  I am more
wondering about the possible benefits of Atom than whether this system
works -- I use it for data import/export of the collections and it is
quite easy to create parsers and generators for this format that let me
move data in and out of the relational database that the application uses.

The database itself is also quite generic: a "collections" table, an
"items" table, a "values" table and an "attributes" table (each value
has an item_id and an attribute_id).  It is important that the data
model be able to grow organically -- as a user adds a new "field" (aka
key or attribute) to describe the items they have, they'll have no
knowledge at all of Atom or Dublin Core or any of that.  And it's fine
-- every collection has a unique set of attributes (aka fields or
keys).  The composite primary key for attribute is "ascii_id" plus
"collection_id".

The system has been in production and heavily used for a couple years,
and includes 88 collections comprising 300,000 items.  There are currently
1358 rows in the attribute table (those are the keys in the key->value
pairs) and 4.5 million rows in the value table.  We've had no problems
at all with this current architecture.  And yet I wonder what Atom could
do for me as a more standard XML format for data serialization...
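
The four-table layout described above can be sketched roughly as follows, using Python and SQLite. This is only an illustration, not the DASe schema: every column name other than item_id, attribute_id, ascii_id and collection_id is an assumption, and because values refer to attributes by attribute_id, the sketch gives attributes a surrogate id and enforces the ascii_id plus collection_id key as a UNIQUE constraint rather than as the primary key itself.

```python
import sqlite3

SCHEMA = """
CREATE TABLE collections (
    id       INTEGER PRIMARY KEY,
    ascii_id TEXT NOT NULL UNIQUE
);
CREATE TABLE items (
    id            INTEGER PRIMARY KEY,
    collection_id INTEGER NOT NULL REFERENCES collections(id),
    serial_number TEXT NOT NULL
);
CREATE TABLE attributes (
    id            INTEGER PRIMARY KEY,
    ascii_id      TEXT NOT NULL,
    collection_id INTEGER NOT NULL REFERENCES collections(id),
    UNIQUE (ascii_id, collection_id)
);
CREATE TABLE "values" (
    id           INTEGER PRIMARY KEY,
    item_id      INTEGER NOT NULL REFERENCES items(id),
    attribute_id INTEGER NOT NULL REFERENCES attributes(id),
    value_text   TEXT NOT NULL
);
"""


def add_value(conn, collection_id, item_id, ascii_id, value_text):
    """Attach a value to an item, creating the attribute on first use."""
    conn.execute(
        "INSERT OR IGNORE INTO attributes (ascii_id, collection_id) VALUES (?, ?)",
        (ascii_id, collection_id),
    )
    (attribute_id,) = conn.execute(
        "SELECT id FROM attributes WHERE ascii_id = ? AND collection_id = ?",
        (ascii_id, collection_id),
    ).fetchone()
    conn.execute(
        'INSERT INTO "values" (item_id, attribute_id, value_text) VALUES (?, ?, ?)',
        (item_id, attribute_id, value_text),
    )


def item_metadata(conn, item_id):
    """Return the item's (ascii_id, value) pairs in insertion order."""
    rows = conn.execute(
        'SELECT a.ascii_id, v.value_text FROM "values" AS v '
        "JOIN attributes AS a ON a.id = v.attribute_id "
        "WHERE v.item_id = ? ORDER BY v.id",
        (item_id,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
collection_id = conn.execute(
    "INSERT INTO collections (ascii_id) VALUES (?)", ("texpol_image_collection",)
).lastrowid
item_id = conn.execute(
    "INSERT INTO items (collection_id, serial_number) VALUES (?, ?)",
    (collection_id, "000435205"),
).lastrowid
for key, text in [
    ("title", "Congress Avenue"),
    ("keyword", "Congress Avenue"),
    ("keyword", "buildings"),
]:
    add_value(conn, collection_id, item_id, key, text)
print(item_metadata(conn, item_id))
```

A new "field" is just a new row in attributes, so the model grows without schema changes, which is the organic growth described above.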

thanks!
Peter Keane
daseproject.org




On Sat, 6 Oct 2007, A. Pagaltzis wrote:


* pkeane <[EMAIL PROTECTED]> [2007-10-05 07:00]:
<item serial_number="000435213">
<metadata ascii_id="admin_checksum">630230b057c511cbee87447960fff02e</metadata>
<metadata ascii_id="admin_filename">62-GT-06.jpg</metadata>
<metadata ascii_id="admin_file_size">318642</metadata>
<metadata ascii_id="admin_image_height">576</metadata>
<metadata ascii_id="admin_image_width">720</metadata>
<metadata ascii_id="admin_mime_type">image/jpeg</metadata>
<metadata ascii_id="admin_serial_number">000435213</metadata>
<metadata ascii_id="admin_upload_date_time">2007-07-18T15:59:28</metadata>
<metadata ascii_id="credit">Photographer: Unknown</metadata>
<metadata ascii_id="dase_rights">Restricted</metadata>
<metadata ascii_id="description">photo of Ben Barnes while Speaker, black and white</metadata>
<metadata ascii_id="keyword">Ben Barnes</metadata>
<metadata ascii_id="keyword">Capitol Building interior</metadata>
<metadata ascii_id="keyword">Lieutenant Governor</metadata>
<metadata ascii_id="keyword">Speaker of the House</metadata>
<metadata ascii_id="original_filename">62-GT-06.jpg</metadata>
<metadata ascii_id="rights_owner">Senate Media Services</metadata>
<metadata ascii_id="rights_status">Use in Texas Politics content</metadata>
<metadata ascii_id="scratch_pad">/62-GT-06.jpg</metadata>
<metadata ascii_id="title">Ben Barnes</metadata>
<metadata ascii_id="used_in_chapter">none</metadata>
<media_file filename="000435213_800.jpg" size="medium" height="576" width="720" mime_type="image/jpeg" />
<media_file filename="000435213_100.jpg" size="thumbnail" height="80" width="100" mime_type="image/jpeg" />
<media_file filename="000435213_640.jpg" size="small" height="480" width="600" mime_type="image/jpeg" />
</item>

Any thoughts on the benefits of using atom here?

I don't see the problem. Atom gives you an Entry where you can
put the metadata for a media resource. You have a bunch of
attributes that should be mapped to Atom elements; the rest you
stick into the content, possibly as RDF since your ad-hoc vocab
is more or less along those lines anyway.
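
As a rough illustration of that split, the sketch below parses an excerpt of the quoted item format and separates the attributes that have an obvious Atom counterpart from the rest. The ATOM_FIELDS mapping and the function names are assumptions for the example; which attributes "should" map to Atom elements is a per-collection decision that this thread does not settle.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Excerpt of the item quoted above.
ITEM_XML = """<item serial_number="000435213">
<metadata ascii_id="admin_mime_type">image/jpeg</metadata>
<metadata ascii_id="description">photo of Ben Barnes while Speaker,
black
and white</metadata>
<metadata ascii_id="keyword">Ben Barnes</metadata>
<metadata ascii_id="keyword">Speaker of the House</metadata>
<metadata ascii_id="title">Ben Barnes</metadata>
</item>"""

# Hypothetical mapping from collection keys to Atom element names.
ATOM_FIELDS = {"title": "title", "description": "summary"}


def parse_item(xml_text):
    """Return (serial_number, {ascii_id: [values...]}) for one item."""
    item = ET.fromstring(xml_text)
    fields = defaultdict(list)
    for node in item.findall("metadata"):
        # Collapse line wrapping inside values into single spaces.
        value = " ".join((node.text or "").split())
        fields[node.get("ascii_id")].append(value)
    return item.get("serial_number"), dict(fields)


def split_for_atom(fields):
    """Separate keys mapped to Atom elements from the remaining keys."""
    mapped = {ATOM_FIELDS[k]: v[0] for k, v in fields.items() if k in ATOM_FIELDS}
    rest = {k: v for k, v in fields.items() if k not in ATOM_FIELDS}
    return mapped, rest


serial, fields = parse_item(ITEM_XML)
mapped, rest = split_for_atom(fields)
print(serial, mapped, rest)
```

The mapped part would become Atom elements on the entry; the rest is what would go into the content (or into extension elements, as in the feed at the top of this thread).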

I cannot get past the fact that my ultra-generic xml schema is
REALLY easy to deal with

It's not actually very generic. It's a very limited vocabulary
that expresses barely any more than a map of key-value pairs. Of
course such a simple data structure is easy to deal with. The
only genericity there is that the keys are arbitrary strings. It
looks easy now because you have to do almost no work up front:
the structure is rigid and the semantics are completely ad-hoc.

It won't look very easy at all once you have a large dataset with
an inconsistent mess of key names.

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>




