Mario Ivankovits wrote:
Hi!
The one problem I see with the service API is that if I'm trying to
find metadata for a FileObject, looking in the service API isn't an
obvious thing to do.
But it is the most powerful solution.

I don't want to change/extend the interface for every single thing we can
imagine in the future.
And one should be able to add commands by simply dropping in a jar.

I agree that we shouldn't have individual accessors/mutators for everything that you might want to get
out of a file (e.g. getAuthor, getCreationDate).

What if we had something like this:

MetadataFactory
[org.apache.commons.vfs.metadata]
   + static getMetadata(FileObject file):Map
   + static getKeys(FileObject file):Set
|
| <<uses>>
V

MetadataReaderFactory
[org.apache.commons.vfs.metadata]
   + static getInstanceByMimeType(String mimetype):MetadataReader
   + static getInstanceByExtension(String ext):MetadataReader
   + static getInstance(FileObject obj):MetadataReader

|
| <<creates>>
V

<<MetadataReader>>
[org.apache.commons.vfs.metadata]
   + getMetadata(): Map<String, String>
   + getMetadataKeys():Set<String> -- allows you to see what metadata is available
   + getMimetypes():List<String>

^
|  <<implements>>
|


ImageMetadataReader [org.apache.commons.vfs.metadata.image]
SoundMetadataReader [org.apache.commons.vfs.metadata.sound]
OpenOfficeMetadataReader [org.apache.commons.vfs.metadata.openoffice]
MicrosoftOfficeMetadataReader [org.apache.commons.vfs.metadata.poi]
...

Presumably one could also add writers for these metadata types using a similar set of classes and interfaces.
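
A rough Java sketch of the reader interface and factory from the diagram above might look like the following. It is only an illustration under stated assumptions: the stub reader and its hard-coded values, the register() method, and the null return for unknown mimetypes are all hypothetical, and the FileObject-based methods are left out so the example stands alone.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical interface mirroring <<MetadataReader>> in the diagram.
interface MetadataReader {
    Map<String, String> getMetadata();
    Set<String> getMetadataKeys();
    List<String> getMimetypes();
}

// Stand-in reader with hard-coded values; a real one would parse the file.
class StubImageMetadataReader implements MetadataReader {
    public Map<String, String> getMetadata() {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("width", "640");
        metadata.put("height", "480");
        return metadata;
    }

    public Set<String> getMetadataKeys() {
        return getMetadata().keySet();
    }

    public List<String> getMimetypes() {
        return List.of("image/jpeg", "image/png");
    }
}

// Hypothetical factory mirroring MetadataReaderFactory in the diagram.
final class MetadataReaderFactory {
    private static final List<MetadataReader> READERS = new ArrayList<>();

    // Assumption: readers are registered explicitly; the diagram does not say how.
    static void register(MetadataReader reader) {
        READERS.add(reader);
    }

    // Returns the first reader that claims the mimetype, or null if none does.
    static MetadataReader getInstanceByMimeType(String mimetype) {
        for (MetadataReader reader : READERS) {
            if (reader.getMimetypes().contains(mimetype)) {
                return reader;
            }
        }
        return null;
    }
}
```

A caller would register a reader once, then ask the factory for one by mimetype and read its map of key/value pairs.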

These classes could invoke services under the hood, but I think the metadata API should be high enough in the package structure, and have obvious enough names, that people don't have to go hunting. I've found that if users can't find something within 5-10 minutes, they figure it's not there and either give up on the API or write their own, neither of which we want.

Most people when they're starting to learn VFS are going to look for
some method in the FileObject (or if they're clever in the
FileContentInfo).  Either of these is a logical place to look
for metadata methods.
But once they stepped into the service API it should be easily
understandable, no?
And as you say, it isn't that new a concept.


The trick is getting them to "step into" the service API to begin with -- it would require them to think of metadata as a service. That doesn't naturally occur to people, so they would probably never think to look in a services package for metadata code. It isn't a new concept; however, its implementation in JAF left a lot to be desired and was difficult for a lot of people to understand. This is the primary reason that it doesn't really get used a lot. It still gives me headaches when I look at the doc on it. :-)

An org.apache.commons.vfs.metadata package would be fairly obvious to most people.

Any ideas about how we could make it easier for them?
Docs, Wiki, mailing list (in this order, I hope ;-) )

All of which are good. But most people only check them after they haven't been able to find it in the Javadocs under some intuitive package name. :-)

Think about how powerful it could be, given the following three things
share the same base class
Open Office metadata
Microsoft Office metadata
MP3/AAC/Ogg metadata
e.g. DocumentInfo which provides something like (title, author, ...)

one can simply look up DocumentInfo.class and get this information. If
one drops in a jar to extract this data from e.g. Java files, the code
will use it right away.

I won't say it isn't possible to do this by extending the API, but I think
it will bloat it.
Is the DocumentInfo some other interface you're thinking of? If so, what's the difference between FileContentInfo and DocumentInfo?
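
One possible reading of the quoted lookup idea is a registry keyed by class. The sketch below is only an illustration of that reading: InfoRegistry, its register() and lookup() methods, SimpleDocumentInfo, and the shape of DocumentInfo (getTitle, getAuthor) are all assumptions, not existing VFS classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical shared type for document-like metadata (title, author, ...).
interface DocumentInfo {
    String getTitle();
    String getAuthor();
}

// Trivial value holder used only for this illustration.
final class SimpleDocumentInfo implements DocumentInfo {
    private final String title;
    private final String author;

    SimpleDocumentInfo(String title, String author) {
        this.title = title;
        this.author = author;
    }

    public String getTitle() { return title; }
    public String getAuthor() { return author; }
}

// Hypothetical registry: callers ask for an info type by its Class object.
final class InfoRegistry {
    private final Map<Class<?>, Object> providers = new HashMap<>();

    <T> void register(Class<T> type, T implementation) {
        providers.put(type, implementation);
    }

    // Empty if nothing was registered for the requested type.
    <T> Optional<T> lookup(Class<T> type) {
        return Optional.ofNullable(type.cast(providers.get(type)));
    }
}
```

With this shape, code that only knows about DocumentInfo never needs to know which format-specific implementation produced it.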

I think most of the "code bloat" would be fairly small. Basically a single new package, and a single method in an interface that returns metadata for specific mimetypes. The actual implementations are simply adapters that implement the interface by making calls to existing APIs capable of reading file metadata. In the case of Open Office, that's a fairly simple matter of looking at the meta.xml file inside the Open Office zip file. For images, there are a couple of different ways of getting at this data: either through Drew Noakes' metadata-extractor API (http://www.drewnoakes.com/code/exif/) or through JAI (http://www.picturegrid.com/community/samples/imageio/). Finally, POI can extract Microsoft Office document metadata.
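
As a minimal sketch of the Open Office case, the meta.xml entry can be pulled out of the zip with the standard java.util.zip classes. The class and method names here (OpenDocumentMeta, readMetaXml) are hypothetical, and the sketch assumes the entry is UTF-8 text sitting at the root of the archive; it returns the raw XML and leaves parsing to the caller.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: reads the metadata entry from an Open Office document.
final class OpenDocumentMeta {
    // Returns the raw meta.xml text, or null if the archive has no such entry.
    static String readMetaXml(File file) throws IOException {
        try (ZipFile zip = new ZipFile(file)) {
            ZipEntry entry = zip.getEntry("meta.xml");
            if (entry == null) {
                return null;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                // Assumption: the entry is encoded as UTF-8.
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }
}
```

An adapter along the lines of OpenOfficeMetadataReader could feed this text to an XML parser and copy the fields it finds into the metadata map.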

Are you anticipating that you'll have some sort of "service discovery" mechanism that will automatically register all services found in the classpath and make them available? If so, then this too would require some work to make it easy for users to use. There would need to be some mechanism for the user to install supporting JARs needed for specific metadata service providers.
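
For illustration only, classpath discovery of this kind could be sketched with java.util.ServiceLoader, which finds implementations listed in META-INF/services files inside the JARs on the classpath. Whether VFS would use this mechanism is an assumption, as are the names MetadataProvider and ProviderDiscovery; ServiceLoader also requires Java 6 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical provider interface that a drop-in JAR would implement.
interface MetadataProvider {
    String name();
}

// Hypothetical discovery helper built on the standard ServiceLoader.
final class ProviderDiscovery {
    // Collects the names of all providers declared on the classpath.
    // The list is empty when no JAR declares an implementation.
    static List<String> discoverNames() {
        List<String> names = new ArrayList<>();
        for (MetadataProvider provider : ServiceLoader.load(MetadataProvider.class)) {
            names.add(provider.name());
        }
        return names;
    }
}
```

A provider JAR would list its implementation class in a META-INF/services file named after the interface, which is the installation step users would still have to get right.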

I believe that most of what I've outlined, though, is so standard and generic that it should be part of the standard VFS distribution rather than available through additional downloads.

I think people usually want and expect everything in a single download, rather than having to make choices about which service providers they want. The existing file system service providers are a good example of this. Right now you have to explicitly download and install additional jars to get some of the functionality that you want. It would be easier if everything you needed to get started were available in a single download, or with a single Ant "install" target.

Hope this clarifies things a bit. Sorry for the ASCII UML diagram. :-)

Mark
