> Would there be any interest in starting a project to deliver indexed
> attribute searching functionality on top of ZFS, akin to WinFS or Be's
> BFS file systems? I've not yet researched the IP/patent implications
> of such functionality, so I'm perfectly willing to hear that it's a
> dangerous notion.
>
> Are there any internal projects underway with similar functionality by
> chance? Have others considered this and ruled it out? I can expound
> further on my thoughts if anyone is interested.
>
> Comments?
Are there even any _external_ (Linux, *BSD, proprietary Unixes) applications that take advantage of such capabilities, or allow the user to interact with them? Yes, there's e.g. Windows XP's Indexing Service and Mac OS X's Spotlight, and between them they probably do represent the bulk of most users' experience and expectations. But there are some problems, IMO.

Read http://www.nasconf.com/pres04/waidhofer.pdf first, just to get some ideas about the filesystem itself; note the division between lightweight "Extended Attributes" that may be significant to the OS, and what are basically subfiles or named streams, which are strictly application territory. Then there's the problem of standard names for things: without those, how the heck does everyone agree that a particularly named Extended Attribute contains, for example, a MIME type, or that a particular named subfile contains a thumbnail (not to mention the preferred format and dimensions of the thumbnail...)?

Now I think I've seen some talk about intercepting file operations, and there's also the user-space FEM API; either or both might be handy for giving an indexer the hooks needed to maintain its index once it's initially built. But (aside from perhaps a clearer distinction between what other systems mean by Extended Attributes vs subfiles or named streams), I'm not sure I understand what you want zfs itself to do that it doesn't already.

Also, a Really Cool indexing service needs some agreement on a schema for useful metadata naming, as do fancy file managers (and applications that know something of the intent of the files they create), so that the index isn't just a pile of unstructured garbage, but rather something smart enough to find all audio/mpeg files (without resorting to extensions or magic numbers) and scan their ID3 tags for particular artists; or all ODF files with certain content, or ...
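To make the schema point concrete, here is a rough sketch of what an index keyed on agreed attribute names might look like. It is only a toy, in-memory illustration: the attribute names `user.mime_type` and `user.audio.artist`, the `AttributeIndex` class and its methods are all hypothetical, and nothing here touches ZFS or any real extended-attribute API. The `update` and `remove` calls stand in for whatever file-operation hooks would feed a real indexer.

```python
from collections import defaultdict

# Hypothetical attribute names, made up for this sketch; as noted above,
# there is no agreed standard to borrow them from.
MIME_TYPE = "user.mime_type"
ARTIST = "user.audio.artist"


class AttributeIndex:
    """Toy in-memory inverted index: (attribute name, value) -> set of paths."""

    def __init__(self):
        self._postings = defaultdict(set)  # (name, value) -> {path, ...}
        self._by_path = {}                 # path -> {name: value}

    def update(self, path, attrs):
        """Record the current attributes of `path`, replacing any old entry."""
        self.remove(path)
        self._by_path[path] = dict(attrs)
        for name, value in attrs.items():
            self._postings[(name, value)].add(path)

    def remove(self, path):
        """Forget `path`, e.g. when a file-event hook reports it was deleted."""
        for name, value in self._by_path.pop(path, {}).items():
            self._postings[(name, value)].discard(path)

    def find(self, criteria):
        """Return sorted paths whose attributes match every (name, value) pair."""
        sets = [self._postings.get(key, set()) for key in criteria.items()]
        if not sets:
            return []
        return sorted(set.intersection(*sets))


index = AttributeIndex()
index.update("/music/a.mp3", {MIME_TYPE: "audio/mpeg", ARTIST: "Miles Davis"})
index.update("/music/b.mp3", {MIME_TYPE: "audio/mpeg", ARTIST: "John Coltrane"})
index.update("/docs/notes.odt",
             {MIME_TYPE: "application/vnd.oasis.opendocument.text"})

# All audio/mpeg files by one artist, with no extension or magic-number tests.
hits = index.find({MIME_TYPE: "audio/mpeg", ARTIST: "Miles Davis"})
```

The query only works because every writer used the same attribute name for the same thing; if one application wrote the MIME type under a different name, its files would simply never show up.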
Seems to me the _first_ thing that everyone needs to agree on is what name and interface to use to store and retrieve the MIME type (as proper metadata, not something deduced from a filename extension or by a typing engine)! But darn if I can find any evidence that there's anything like a standard that far along...

This message posted from opensolaris.org
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
