Re: Documents and Schemas

Tad Glines Thu, 25 Feb 2010 13:36:47 -0800

I think that wave based schemas could work, and would obviate the need to
have a hard coded association between document names/prefixes and schemas.
However, there will need to be a mechanism for propagation of the schema
wave along with the content wave. For example, if the schema is on server A
and the new wave is on server B and one of the participants is on server C,
how would server C obtain the contents of the schema wave on server A? One
possible solution is to do the following:

   1. Allow a server to be a participant in a wavelet. If a participant ID
   has a domain, but no name, then it would be treated as the identity of the
   server itself. An alternative is to use "*" as the name.
   2. Have an assumed access edge of "*...@server" -> (READ, ADD_ME) -> "*...@*"
   3. Any wave that has "*...@*" as a participant is treated as globally
   readable, but not writable.
   4. Any remote server could submit a ProtocolWaveletDelta with a
   hashed_version set to version 0 that contains a single operation adding
   itself as a participant to a global wave. Since the only allowed operation
   in this case is an AddParticipant, the transformation is a special case and
   doesn't require full history retrieval and transformation.
   5. If the AddParticipant is allowed, that server would then be able
   to make a history request and would receive future updates.
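
As a rough illustration of steps 1, 3 and 4, a minimal Python sketch might look like the following. Everything named here (the Delta class, the "add_participant" op name, the helper functions) is hypothetical and not part of any wave server. The "*@*" spelling of the global participant is also only a placeholder, since the archived patterns above are truncated, and the hashed_version is reduced to a plain version number.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# Placeholder spellings: the archived message truncates the real patterns.
WILDCARD_NAME = "*"
GLOBAL_PARTICIPANT = "*@*"


def is_server_identity(participant_id: str) -> bool:
    """Step 1: a domain with no name (or "*" as the name) denotes a server."""
    name, sep, domain = participant_id.partition("@")
    has_real_domain = sep == "@" and bool(domain) and domain != "*"
    return has_real_domain and name in ("", WILDCARD_NAME)


def is_globally_readable(participants: Set[str]) -> bool:
    """Step 3: a wave listing the global participant is readable by anyone."""
    return GLOBAL_PARTICIPANT in participants


@dataclass
class Delta:
    author: str                  # submitting server's participant ID
    target_version: int          # stands in for the hashed_version
    ops: List[Tuple[str, str]]   # (op_name, argument) pairs


def accept_self_add(participants: Set[str], delta: Delta) -> bool:
    """Step 4: accept a version-0 delta whose only op adds the submitter."""
    if not is_globally_readable(participants):
        return False
    if delta.target_version != 0 or len(delta.ops) != 1:
        return False
    op_name, arg = delta.ops[0]
    if op_name != "add_participant" or arg != delta.author:
        return False
    if not is_server_identity(arg):
        return False
    participants.add(arg)  # step 5 (history access) would follow from this
    return True
```

Because only a single self-add is accepted, the check never needs the wavelet history, which is the point of treating it as a special case.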

An alternative is to simply allow any remote server to perform a history
request with an unbounded end version for a global wave. The "server
identity" concept would also make it easier to propagate access edges, one
per wave instead of one per participant.

Since the content wave would also need to specify the version of the schema
wave and not just its identity, the simple history request approach might
be sufficient. In that case each time the remote server encounters a content
wave referencing a newer version of the schema wave, it would request the
missing version from the schema wave's host.
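
A small Python sketch of that "fetch only what is missing" behaviour, under stated assumptions: the SchemaCache class and the fetch_history callback are hypothetical stand-ins for whatever history request the federation layer offers, and versions are plain integers with the hash part ignored.

```python
from typing import Callable, Dict


class SchemaCache:
    """Tracks the newest schema-wave version this server already holds."""

    def __init__(self, fetch_history: Callable[[str, int, int], None]):
        # fetch_history(wavelet_name, start_version, end_version) is a
        # hypothetical callback that asks the schema wave's host for deltas.
        self._fetch_history = fetch_history
        self._known: Dict[str, int] = {}

    def ensure_version(self, wavelet_name: str, version: int) -> bool:
        """Request missing history if a content wave references a newer version.

        Returns True when a history request was made, False otherwise.
        """
        have = self._known.get(wavelet_name, 0)
        if version <= have:
            return False
        self._fetch_history(wavelet_name, have, version)
        self._known[wavelet_name] = version
        return True
```

With this shape a server only contacts the schema host when a content wave references a version it has not yet seen.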

The association between a wavelet and its schema(s) could be handled by a
single "schema" document that contains a list of mapping documents, and
each mapping document would contain the details. For example:
The "schema" document could contain:
<schema_list>
  <schema name="conversation"
          wavelet_name="schema!google.com/conversation!google.com"
          wavelet_version="23:23FDE98AB54E"
          document_id="schema+conversation"/>
</schema_list>

The "schema+conversation" document could contain:
<schema_mappings for="conversation">
  <schema_mapping name="conversation" schema_document="conversation"
                  document_prefix="conversation"/>
  <schema_mapping name="blip" schema_document="blip" document_prefix="b+"/>
</schema_mappings>
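
To show how a server might use these two documents, here is a minimal Python sketch that follows each schema entry to its mapping document and resolves a document ID to a schema. The function names are hypothetical, the wrapped wavelet_name attribute is joined onto one line, and treating document_prefix as a longest-prefix match is an assumption; the proposal does not say how overlapping prefixes would be resolved.

```python
import xml.etree.ElementTree as ET

SCHEMA_DOC = """<schema_list>
  <schema name="conversation"
          wavelet_name="schema!google.com/conversation!google.com"
          wavelet_version="23:23FDE98AB54E"
          document_id="schema+conversation"/>
</schema_list>"""

MAPPING_DOC = """<schema_mappings for="conversation">
  <schema_mapping name="conversation" schema_document="conversation"
                  document_prefix="conversation"/>
  <schema_mapping name="blip" schema_document="blip" document_prefix="b+"/>
</schema_mappings>"""


def load_mappings(schema_xml, documents):
    """Return (prefix, wavelet_name, wavelet_version, schema_document) rules.

    `documents` maps a document ID to its XML text within the same wavelet.
    """
    rules = []
    for schema in ET.fromstring(schema_xml).findall("schema"):
        mapping_root = ET.fromstring(documents[schema.get("document_id")])
        for mapping in mapping_root.findall("schema_mapping"):
            rules.append((
                mapping.get("document_prefix"),
                schema.get("wavelet_name"),
                schema.get("wavelet_version"),
                mapping.get("schema_document"),
            ))
    return rules


def resolve(rules, document_id):
    """Pick the rule with the longest matching prefix, or None (assumption)."""
    matches = [rule for rule in rules if document_id.startswith(rule[0])]
    return max(matches, key=lambda rule: len(rule[0]), default=None)


rules = load_mappings(SCHEMA_DOC, {"schema+conversation": MAPPING_DOC})
blip_rule = resolve(rules, "b+abc")
```

Here blip_rule names the "blip" schema document in the referenced schema wavelet, pinned to the version given in the "schema" document.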

The alternative is to require each document to have a root element that
contains the schema mapping in attributes. This would require more indexing
work and not allow a document to have more than one root element.

-Tad

On Thu, Feb 25, 2010 at 1:42 AM, Alex North <[email protected]> wrote:

> Hi Tad,
>
> Thanks for raising this matter. We have indeed considered this problem and
> many of the options that have arisen in this thread.
>
> The wavesandbox wave server already enforces schemas. We plan to continue
> with this approach. The current schemas are hard-coded, not very flexible,
> and based on the document namespace (e.g. b+ -> blip schema). We're working
> on a better approach. Our current plan is for schemas to be stored in
> wavelets, written in a schema language, and referenced by the documents to
> which they apply. The reference would include a version and signature of the
> schema wavelet.
>
> We're still thinking about ways to upgrade the schema of a document.
> Suggestions welcome! The problems pointed out above are tricky indeed.
>
> Enforcing schemas at the server is not prohibitively expensive, and well
> worth it for the guarantees that can be provided about the data.
>
> Cheers,
> Alex
>
>
> On 17 February 2010 15:49, Tad Glines <[email protected]> wrote:
>
>> There seem to be three possible ways to enforce a schema in wave:
>> 1. Play nice: Clients are expected to play nice together.
>> 2. Server enforced: The server (specifically the federation host)
>> enforces the schema.
>> 3. Agent enforced: Extend the existing agent interface so that the
>> server can pass a transformed but unapplied delta to a "schema" agent
>> for verification.
>>
>> The play nice model has fundamental problems with trust and
>> consistency. All it takes is one rogue client (accidental or
>> intentional) and many a wave goes "shiny".
>>
>> The server enforced model can work, but there's a question of how the
>> server would know what schema to enforce on which documents (or set of
>> documents). For some common schemas a fixed namespace to schema
>> mapping might work. But for less well known clients, this could become
>> problematic. Defining a schema in a wave might work, but only if there
>> was a way to assign that "schema" wave to the "content" wave and
>> ensure that the "schema" wave was propagated along with the "content"
>> wave. The existing wave mechanisms don't support this.
>>
>> The agent approach has the potential to make things a little simpler.
>> We already have the concept of agents that perform actions within a
>> wave (spelly for example). And some robots being developed are
>> performing content control type actions. And users are becoming
>> familiar with the idea of adding a robot/agent to their wave to
>> enhance it. So perhaps we can extend this "plug-able capability"
>> concept to include the addition of content enforcement capabilities to
>> agents. So, for example, when I create a "conversation" wave the
>> "conversation" agent is added as a non-removable participant to the
>> wave and acts as the schema enforcement agent. All deltas would be
>> passed to the agent prior to being applied. The agent would have the
>> option of requesting additional content from the wave before
>> approving/disapproving the delta. This might also work with robots, but
>> has the potential to cause a serious bottleneck if the robot isn't
>> implemented/provisioned appropriately.
>>
>> -Tad
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Wave Protocol" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected]<wave-protocol%[email protected]>
>> .
>> For more options, visit this group at
>> http://groups.google.com/group/wave-protocol?hl=en.
>>
>>
