And how is this different from RDF/OWL?
On Jan 10, 2008 12:06 AM, =JeffH <[EMAIL PROTECTED]> wrote:
> Of possible interest..
>
> DataPortability.Public.General
> Overview - The WRFS (aka WebFS) Inode
> http://groups.google.com/group/dataportability-public/web/WRFS%20-%20Web%20Inode%20Overview
>
> http://dataportability.org/
>
> Web Inode Overview
> Web Inode Quick Notes
>
>     * Permission "flags"
>     * visible, read, last access
>     * Indexes only minimal data, service exists for user, possibly a uri for
>       type of service
>     * Able to be queried
>     * Is pointed to by Attribute Exchange? resides inside AX?
>     * Data can be public, private
>     * All features are "Opt in", from the start
>     * Can point to other web inodes (indirect blocks)
>     * Each service / data container has access to change its own key-value
>       pair entry, i.e., "exists/doesn't exist" { Add, Update, Remove }
>
> The case for an inode structure for data on the web
>
> From a user's perspective the system should:
>
>     * Allow them to use their data in a {web, desktop} application regardless
>       of where the data is stored (in one place, in multiple locations, etc)
>     * Not force the user to configure each data store, each application, or
>       identity provider for each application. It simply manages the
>       permissions for the user, securely, and allows data { images, video,
>       email etc } aggregation seamlessly behind the scenes.
>     * Keep things simple. Let the user do what the user wants to do, as
>       opposed to trying to keep the user in a "walled garden". Let the market
>       decide how data should be used.
>
> As developers, we want a system that:
>
>     * Makes all of a user's data accessible from any application on the
>       internet while being stored in a variety of containers across many
>       different sites.
>     * Makes a user's data accessible from a query language or a proxy-api
>     * Implicitly knows how to "discover" where a user has data stored.
>     * Knows how to aggregate a user's data together across multiple
>       containers and identities.
>     * Allows the user to control who can see and query what on their behalf
>     * Presents a universal data api for web data allowing data to be queried
>       against as if it were in a single filesystem or database.
>
> Hey, while we're aiming big, let's go for it -- we might also want it to:
>
>     * Allow an application to view the data from the perspective of a
>       filesystem and from the perspective of a database
>     * Potentially act as a "Data Cloud" drive on a portable device (example:
>       an iPhone thinks its I: drive is a local disk, but it's actually the
>       system we are proposing)
>
> So exactly what are you saying here?
>
> Basically, the WRFS Inode allows us to view, query, and aggregate a user's
> data, regardless of location (restriction: data must be accessible through a
> webserver), in much the same way that we do with a local disk-based
> filesystem (at least in most ways). Say, how does a filesystem work, anyway?
> That sounds like a good place to go for a start on our model!
>
> Goals:
>
>     * Store
>     * Aggregate
>     * Protect
>     * Relate
>     * Query
>
> Our data? Huh. What other systems do these same things?
>
>     * A Filesystem
>     * A Database
>     * DNS
>
> So really we want to do some things that have already been done quite well in
> computer science. So let's take a look at how they do it, and build a
> roadmap/model of how we might create an abstraction.
>
> The Filesystem as a Metaphor
>
> What are some interesting properties of a filesystem that are very applicable
> to our situation?
>
>     * A filesystem abstracts away the details of storing bits on a storage
>       system from the user or programmer so they can focus on working with
>       files as, well, "files". The programmer simply says "I want a stream to
>       read from a file called "foo.txt" in my "/usr" directory, go get it and
>       return me a data structure or stream".
>       The programmer doesn't worry about inodes, bmap, or the size of a disk
>       block (on average), because at the application level of the abstraction
>       model those underlying details should simply be "taken care of". Just
>       think about this --- if you had to worry about managing free disk blocks
>       in a linked list every time you wanted to open a text file, you might
>       start thinking about changing professions. Abstractions are your friend.
>     * So exactly what happens when we open a file to read? A file is stored
>       all in one spot, right? Not quite. A hard drive has a disk that is a
>       series of "disk blocks", which are all the same size (generally 4KB),
>       which is considerably smaller than your average mp3 file. So how does a
>       file get read from disk if it's stored in all those 4KB chunks? It goes
>       roughly like:
>          1. --- Translation of filename and directory to inode ---
>          2. --- Translation of inode and offset into disk block using the
>             "bmap" system call ---
>          3. --- file reader/writer is returned at offset in block ---
>       Where:
>          o Inode - a data structure on disk that represents a file and has
>            pointers to all of the disk blocks on disk that contain the actual
>            bytes for the file.
>     * So now you say "great, you've told us how a basic filesystem works, but
>       that doesn't help flesh out this grand distributed file system..." ---
>       and to that I say "hold on, lemme finish, I'm going somewhere with
>       this". What property of the filesystem can we apply to our design goal
>       abstraction?
>
>       Store and Aggregate.
>
>       We have to be able to store our data in whatever container we want on
>       the internet, and we need to be able to aggregate that data back
>       together again, right? Well, what if we said:
>          o The concept of a disk block could be related to a web server
>            itself to satisfy the storage requirement. The system call Bmap()
>            is mapped to a standard web api { soap, rest, json } on the
>            webserver for exposing data.
>          o The concept of an inode could be related to a data index store to
>            satisfy the aggregation requirement. This online store would be
>            needed to tell a program/agent "hey, userID X has images in flickr,
>            smugmug, and myspace", which would then take those uris and make
>            the proper data queries via the standard web data apis on the
>            respective data containers. This eliminates the need to actually
>            cache someone's data in a third party site; the data should be able
>            to be "discovered" and aggregated at runtime. The data aggregation
>            layer might do something like take an openID identifier and query a
>            data index store (possibly referenced by the openID provider
>            itself) to get the list of data containers, query each one, and
>            then return a data structure (or recordset) of relevant data
>            stubs+uris for specific user files to the application layer.
>
> [table elided]
>
> The Database as a Metaphor
>
> I think really the aspects of a database that are interesting in this context
> are Protect, Relate, and Query.
>
>     * Protect - The data stubs returned from the data containers might not be
>       accessible from all applications or third parties. A database can set
>       who can view and update a table down to the record level. Our system
>       might want that level of granularity.
>     * Relate - The data stubs returned from the data containers might have
>       relational properties and have transformations performed against them
>       before being returned to the application layer.
>     * Query - Just as a SELECT statement is parsed into a query tree before
>       being executed against a database table/view, we might want to execute
>       similar functions/filters against the returned aggregated data.
>
> DNS as a Metaphor
>
> DNS allows us to take a domain name and translate it into an IP address. This
> is interesting from the standpoint of our need to resolve an openID-like token
> into a set of data container uris for a given data type { social graph,
> images, videos, ?
> }
>
> ---
> end
>
> _______________________________________________
> General mailing list
> [email protected]
> http://simile.mit.edu/mailman/listinfo/general
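
To make the comparison concrete, here is a rough sketch of the lookup the quoted overview describes: an identifier resolves to an index entry (the "web inode"), the entry lists container URIs per data type and may point at further inodes (the "indirect blocks"), and an aggregation layer queries each container and merges the returned stubs. Everything in it is an assumption made for illustration: the names WebInode, resolve and aggregate, the example URIs, and the stand-in fetch function are hypothetical and do not come from WRFS or any DataPortability document. No network calls are made.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class WebInode:
    """Hypothetical index entry: where one user keeps data, by data type."""
    user_id: str
    # data type -> container URIs (playing the role of disk block pointers)
    containers: Dict[str, List[str]] = field(default_factory=dict)
    # further inodes to follow (the "indirect blocks" from the quick notes)
    indirect: List["WebInode"] = field(default_factory=list)


def resolve(inode: WebInode, data_type: str,
            seen: Optional[Set[int]] = None) -> List[str]:
    """Collect container URIs for data_type, following indirect inodes."""
    seen = set() if seen is None else seen
    if id(inode) in seen:  # an inode may be reachable twice; visit it once
        return []
    seen.add(id(inode))
    uris = list(inode.containers.get(data_type, []))
    for child in inode.indirect:
        uris.extend(resolve(child, data_type, seen))
    return uris


def aggregate(inode: WebInode, data_type: str,
              fetch: Callable[[str], List[dict]]) -> List[dict]:
    """Query every container through fetch(uri) and merge the data stubs."""
    stubs: List[dict] = []
    for uri in resolve(inode, data_type):
        stubs.extend(fetch(uri))
    return stubs


def fake_fetch(uri: str) -> List[dict]:
    """Stand-in for a container's web api; returns one stub per container."""
    return [{"container": uri, "uri": uri + "/item/1"}]


# "userID X has images in flickr, smugmug, and myspace"
extra = WebInode("userX", {"images": ["https://myspace.example/userX"]})
root = WebInode(
    "userX",
    {"images": ["https://flickr.example/userX",
                "https://smugmug.example/userX"]},
    indirect=[extra],
)
stubs = aggregate(root, "images", fake_fetch)
```

Note that nothing is cached here: the stubs are gathered at call time, which is the "discovered and aggregated at runtime" property the overview asks for. Permissions and querying are left out entirely.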
