Nicola Ken Barozzi wrote:...
Why not insert the metadata in the file itself like I proposed? I prefer to keep faith with the one file -> one output rule.
I guess our different views here are because of different use cases. You seem to be assuming that you only ever want the meta-data and the content together. That is not the case for me.
Meta-data is often processed independently of actual data. For example, a meta-data harvester is not interested in content. Of course, this would still be possible if it were all in the same file, but performance would suffer.
So it's a technical and not a design issue. I tend to optimize later (and sometimes get bitten ;-)
In some use cases this performance bottleneck will become an issue. For example, I have Learning Objects that consist of over 600 pages, each averaging around 4000 characters. Each page has an additional 1000 characters of meta-data (not including XML elements). The nature of XML is that you need to process the whole file even if you only want one element. This means we are processing 5000 characters per page instead of 1000; when harvesting 600 pages, that is 2,400,000 characters of data that we don't want (and we are still ignoring the XML elements).
Well, that's not true, as the streaming parser can stop processing at any time. This is how we get the doctype.
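To illustrate the early-stop point, here is a minimal sketch using Python's standard xml.sax module. It assumes the meta-data sits in a <meta> element near the top of the file; the element names (document, meta, title, author) and the sample values are hypothetical and are not Forrest's actual format.

```python
import io
import xml.sax


class StopParsing(Exception):
    """Raised by the handler to abandon the parse once <meta> is closed."""


class MetaHandler(xml.sax.ContentHandler):
    """Collects the children of a hypothetical <meta> element, then stops."""

    def __init__(self):
        super().__init__()
        self.in_meta = False
        self.current = None      # name of the meta child being read
        self.meta = {}
        self.elements_seen = 0   # counts start tags actually dispatched

    def startElement(self, name, attrs):
        self.elements_seen += 1
        if name == "meta":
            self.in_meta = True
        elif self.in_meta:
            self.current = name
            self.meta[name] = ""

    def characters(self, content):
        # Text may arrive in several chunks, so append rather than assign.
        if self.in_meta and self.current:
            self.meta[self.current] += content

    def endElement(self, name):
        if name == "meta":
            raise StopParsing    # no further events are dispatched
        if name == self.current:
            self.current = None


def read_meta(stream):
    """Return (meta dict, number of start tags seen) for a byte stream."""
    handler = MetaHandler()
    try:
        xml.sax.parse(stream, handler)
    except StopParsing:
        pass
    return handler.meta, handler.elements_seen


# Hypothetical page: a small meta block followed by a large body.
page = (
    b"<document><meta><title>Page 1</title><author>A. Author</author></meta>"
    b"<body>" + b"<p>content</p>" * 100 + b"</body></document>"
)
meta, seen = read_meta(io.BytesIO(page))
print(meta, seen)
```

The saving only applies when the meta-data comes before the content in the file; if it came last, the parser would still have to read everything before it.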
Furthermore, meta-data tends to be highly structured and would therefore benefit from being stored in a relational database rather than an XML one.
Hmmm, this is an interesting point.
My use case for metadata is title, author, etc. You have a much more complex use case. I now start to understand more of your POV.
Meta-data is often the subject of complex queries and the speed of a relational database is useful here. If the meta-data is stored inside the content file then this can only be done by duplicating the data across two locations. I don't want to do that; I only want to store it once. This would force the meta-data to be separate from the content.
Further still, it is also common in some use cases (e.g. editing Learning Objects) for a meta-data editor to be working on the meta-data at the same time as the content author. Having the files separate is useful in the absence of Version Control that does not require technical knowledge (most of my users can barely use an email client).
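As a rough sketch of the kind of query meant here, the following uses Python's built-in sqlite3 module with an in-memory database. The table layout (page_meta), the page paths and the meta-data values are all assumptions made for illustration; nothing in this thread specifies a schema.

```python
import sqlite3


def build_store(rows):
    """Create an in-memory meta-data store from (path, name, value) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page_meta ("
        " path TEXT, name TEXT, value TEXT,"
        " PRIMARY KEY (path, name))"
    )
    conn.executemany("INSERT INTO page_meta VALUES (?, ?, ?)", rows)
    return conn


def pages_where(conn, name, value):
    """Return the paths of pages whose meta-data field `name` equals `value`."""
    cur = conn.execute(
        "SELECT path FROM page_meta WHERE name = ? AND value = ? ORDER BY path",
        (name, value),
    )
    return [row[0] for row in cur]


# Hypothetical meta-data for three pages of a Learning Object.
store = build_store([
    ("lo1/page1.xml", "author", "A. Author"),
    ("lo1/page1.xml", "level", "beginner"),
    ("lo1/page2.xml", "author", "B. Writer"),
    ("lo1/page2.xml", "level", "beginner"),
    ("lo1/page3.xml", "author", "A. Author"),
    ("lo1/page3.xml", "level", "advanced"),
])
print(pages_where(store, "author", "A. Author"))
```

The content files are never opened for such a query, which is the point of keeping the meta-data in one separate place.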
I understand now.
All that being said, Forrest could be made to support either a separate file or embedded data (there are use cases where the simplistic solution is the best one). The problem with this is that we will have two locations for storing the same data, which could be confusing for users.
IMHO confusing users is the least we have to worry about. I have seen that when there is a simple way and a more complete way, users do not get confused.
The only confusion would come out of using both methods at the same time, with clashing metadata values. Ouch!
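One possible way to guard against that clash, sketched in Python: merge the two sources and refuse to continue when the same key carries different values. The function name merge_metadata and the sample keys are hypothetical; whether Forrest should fail, warn, or prefer one source is an open design choice, not something decided in this thread.

```python
def merge_metadata(embedded, external):
    """Merge embedded and separate-file meta-data, rejecting clashing values."""
    clashes = {
        key: (embedded[key], external[key])
        for key in embedded.keys() & external.keys()
        if embedded[key] != external[key]
    }
    if clashes:
        raise ValueError(f"clashing metadata values for: {sorted(clashes)}")
    # No clashes, so the order of the merge does not matter.
    return {**embedded, **external}


# Hypothetical values: one key agrees, so the merge succeeds.
merged = merge_metadata(
    {"title": "Page 1", "author": "A. Author"},
    {"author": "A. Author", "level": "beginner"},
)
print(merged)
```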
I'll browse the web for 'rdf in html', 'xhtml metadata' etc. to see how this is defined elsewhere. I want to reinvent the wheel as little as possible.
Also, what is the relationship between skinconf.xml, pdf-output.conf.xml and metadata.xml?
The files in FORREST-INF are the defaults. So skinconf.xml contains the site-wide defaults for presentation elements that are core to Forrest. pdf-output.conf.xml contains the defaults for PDF config (button on or off, page size etc.). metadata.xml contains site-wide meta-data (generated-by, site title etc.).
Why are skinconf.xml and pdf-output.conf.xml separate?
As with your RT you can then have versions of these files in subsites that will override these defaults within the scope of the subsite.
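The override behaviour can be sketched with Python's collections.ChainMap, which looks a key up in the most specific scope first and falls back to the defaults. The keys and values are illustrative only (the names echo the "generated-by" and "site title" examples mentioned in this thread), and effective_config is a hypothetical helper, not part of Forrest.

```python
from collections import ChainMap


def effective_config(*scopes):
    """Return a lookup over the given scopes, most specific scope first."""
    return ChainMap(*scopes)


# Site-wide defaults, as they might be read from the default files.
site_defaults = {"generated-by": "Forrest", "site-title": "Example Site"}

# A subsite overrides only the values it cares about.
subsite_overrides = {"site-title": "Learning Objects"}

config = effective_config(subsite_overrides, site_defaults)
print(config["site-title"], config["generated-by"])
```

Values the subsite does not set are still visible through the defaults, so each value is stored once, in the scope that owns it.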
Yup.
It seems we are in violent agreement on the concept, just need to iron out some smaller details.
--
Nicola Ken Barozzi [EMAIL PROTECTED]
- verba volant, scripta manent -
(discussions get forgotten, just code remains)