Re: [xwiki-devs] [Proposal] New DocumentName and DocumentNameFactory component + open questions

Sergiu Dumitriu Tue, 24 Jun 2008 00:46:12 -0700

Vincent Massol wrote:
> Hi,
> 
> In the new rendering code I need to call some code that transforms  
> [[[wiki:][Space.]][Doc]] into a link. I'm proposing to introduce 2 new  
> classes/components in Core:
> 
> * DocumentName: Represents a Document's Name. It'll have 3 properties:
>    - String wiki
>    - String space
>    - String page


See below.

> * DocumentNameFactory: Create a DocumentName from a string  
> representing a Document's name. Transforms [[[wiki:][Space.]][Doc]]  
> into a DocumentName object.

See below.

> * The DocumentNameFactory would depend on the Execution component so  
> that it can use the current wiki, current space and current document  
> if these are not specified.

+1

> * This raises the question as to whether we should continue passing a  
> String representing a document name in our APIs in the future or  
> instead pass a DocumentName. I'm not yet sure what is the best answer  
> to this...

DocumentName whenever possible, but also allow Strings for backwards
compatibility and for easy access from scripts. At least for the moment;
maybe later we can drop Strings, if we see that working with DocumentName-s
is good, simple and easy.

> * Other question: In the Document object do we store the DocumentName  
> object or do we store instead only Space and Wiki objects? If it's the  
> latter then we need to fetch them from the DB which takes time. We  
> could also decide to only fetch them when requested with getSpace()  
> and getWiki() (i.e. lazy loading).

I don't know why we need to store wiki objects. As far as things are now, wikis 
don't share the same 
database/schema. Sure, the document should be able to access the wiki it 
belongs to, but I see no 
need for a persistent relationship between the two. A reference to the wiki 
object can be added when 
creating the Document object.

As for spaces, right now I think that first we must define what a space is (or 
will be), and then 
see if it makes sense to make the link between documents and spaces, and if 
this link should be 
persisted in the Document object.

> * BTW this also raises the question as to whether we want to have a  
> representation for space and wiki or not and instead only use tags, in  
> which case a document name is simply a String like "mypage". But then  
> it should be unique. So it could also be made of a list of identity  
> tags as in: "space=sp1,sp2:wiki:wiki1:language=fr:mypage". Or we could  
> standardize it as "wiki1:sp1,sp2:fr:mypage" and have the  
> DocumentNameFactory transform it into tags. In that case the  
> DocumentName object would be a Map of tags + the document name  
> ("mypage"). I think we need to decide ASAP if we want to keep the  
> strict and hardcoded notion of Wiki>Space>Document>Object>Property or  
> instead go full tags since this changes completely the v2 interfaces  
> and code we're writing.

There have been many posts on the folksonomy vs ontology, tags vs hierarchies, 
loose semantics vs 
rigid semantics debates. So far, neither is winning (at least not on all 
points, and not for everybody).

My take is that tags have the advantage that they are much more flexible and 
sometimes better at 
organizing data, but hierarchies are needed, too. We cannot get rid of spaces. 
A lot of users 
require them (and require even deeper hierarchies). A lot of our features and 
strong points come 
from here, although these features could be mostly reworked to be based on tags.

So, I think that we should put more power in tags. And we should keep spaces. 
And add hierarchical 
spaces, too. But we should change the way spaces and documents work.

The major problem with the current way spaces are implemented is that there
is a strong link between spaces, document IDs, URLs and the whole platform
code. This is wrong. URLs should be a way to access the wiki, and not a
strict, unique reference to documents. Spaces should be a way to organize
documents, not a major part of the document definition. Like in a FS, a file
is NOT defined by the directory it resides in. You can move files around
without changing the way they work or the data they contain. We should do the
same. To go further, documents are not at all dependent on the document name,
either. In a modern FS, a document can have several names in several places,
as _hard_ and symbolic links.

One of the best things about XWiki (and about wikis in general, as opposed to
CMSs) is that documents have names, and not just numbers. But XWiki went too
far with this, by using the names as internal identifiers. Confluence got it
right: internal IDs are the unique identifiers, while pretty names are what
users see and use.

- Spaces should not be a part of documents, but a "feature", or "property" of 
them.
- A document should have _at least_ one access name
- The way URLs identify documents should be pluggable. We kind of have this in 
one direction, with 
the URL factories, but we don't have it the other way around. The XWiki giant 
class has only one 
method for finding a document, given a URL.
- We should also have pluggable document identifier components. For example, 
the language field 
should not be hardcoded in the Document class, but another optional feature of 
documents. The space 
feature as well.
- When retrieving documents from the database, retrieve not an exactly
identified document, but one that best matches a set of criteria. For
example, retrieve a document that has the "name" "WebPreferences" (matches
172 documents); and which has the "space" "Documents.Media.Music" (matches 3
documents); and which has the "language" "en" (matches 1 document). All these
should be optional document properties. The only required document property
is the unique ID.
- All these document identification features could (should?) be components (as 
in Plexus/IoC 
components).
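
To make the retrieval idea above concrete, here is a rough sketch of the
narrowing step. All names in it (StoredDocument, DocumentMatcher, narrow) are
made up for illustration, and it works on an in-memory list rather than on the
database. It is only one possible reading of "best matches": each criterion
strictly narrows the candidate set, in the order given.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stored document: only the ID is required, every feature is optional. */
class StoredDocument {
    final long id;
    final Map<String, String> features = new LinkedHashMap<String, String>();

    StoredDocument(long id) {
        this.id = id;
    }

    /** Sets an optional feature and returns this document, to allow chaining. */
    StoredDocument with(String feature, String value) {
        features.put(feature, value);
        return this;
    }
}

/** Hypothetical matcher that narrows the candidates one criterion at a time. */
class DocumentMatcher {
    static List<StoredDocument> narrow(List<StoredDocument> all, Map<String, String> criteria) {
        List<StoredDocument> candidates = new ArrayList<StoredDocument>(all);
        for (Map.Entry<String, String> criterion : criteria.entrySet()) {
            List<StoredDocument> next = new ArrayList<StoredDocument>();
            for (StoredDocument doc : candidates) {
                // A document that lacks the feature does not match this criterion.
                if (criterion.getValue().equals(doc.features.get(criterion.getKey()))) {
                    next.add(doc);
                }
            }
            candidates = next;
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<StoredDocument> all = new ArrayList<StoredDocument>();
        all.add(new StoredDocument(1L).with("name", "WebPreferences").with("language", "en"));
        all.add(new StoredDocument(2L).with("name", "WebPreferences").with("language", "fr"));
        Map<String, String> criteria = new LinkedHashMap<String, String>();
        criteria.put("name", "WebPreferences");
        criteria.put("language", "en");
        System.out.println(narrow(all, criteria).size() + " document(s) left");
    }
}
```

A real implementation would probably push the criteria down into the storage
query, and might relax a criterion instead of returning nothing when it
eliminates every candidate. Neither is shown here.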


Now, back to the DocumentName: it should not have a strong fixed type (wiki,
name, space and language), but a loose collection of features. It should have
a constructor that interprets one map-like string (like Vincent proposed), a
constructor that can interpret old-style document names, a constructor that
receives a map (string -> string, feature name -> value), and a plain
constructor, with the features being set later with
identifier.set("property", "value").
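
As a minimal sketch of what such a class could look like (nothing here exists
in the code base; the class name, the method names and the "key=value;key=value"
string syntax are all assumptions). Since Java cannot have two constructors
that both take a single String, the two string forms are written as static
factory methods instead.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical identifier: a loose collection of named features. */
class DocumentIdentifier {
    private final Map<String, String> features = new LinkedHashMap<String, String>();

    /** Plain constructor; features are set later with set(). */
    DocumentIdentifier() {
    }

    /** Constructor receiving a feature name -> value map. */
    DocumentIdentifier(Map<String, String> initial) {
        features.putAll(initial);
    }

    /** Parses a map-like string; the "key=value;key=value" syntax is an assumption. */
    static DocumentIdentifier fromFeatureString(String text) {
        DocumentIdentifier id = new DocumentIdentifier();
        for (String pair : text.split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                id.set(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return id;
    }

    /** Parses an old-style name of the form [wiki:][Space.]Page (simplified). */
    static DocumentIdentifier fromClassicName(String name) {
        DocumentIdentifier id = new DocumentIdentifier();
        String rest = name;
        int colon = rest.indexOf(':');
        if (colon >= 0) {
            id.set("wiki", rest.substring(0, colon));
            rest = rest.substring(colon + 1);
        }
        int dot = rest.indexOf('.');
        if (dot >= 0) {
            id.set("space", rest.substring(0, dot));
            rest = rest.substring(dot + 1);
        }
        id.set("name", rest);
        return id;
    }

    DocumentIdentifier set(String feature, String value) {
        features.put(feature, value);
        return this;
    }

    /** Returns the feature value, or null when the feature is not set. */
    String get(String feature) {
        return features.get(feature);
    }

    Map<String, String> asMap() {
        return Collections.unmodifiableMap(features);
    }

    public static void main(String[] args) {
        System.out.println(fromClassicName("xwiki:Main.WebHome").asMap());
    }
}
```

Missing parts are simply left unset here; filling them in from the current
wiki, space and document would be the job of whatever creates the identifier.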

Since these are loose features, we can allow a Document object to have two
names, or three spaces. Do we want to link features between them? That is,
should 2 names x 3 spaces give 6 classic identifiers, or should we be able to
say that the name N1 is only valid in space S1, and N2 is valid in S2 and S3?
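
The two options can be compared side by side in a small sketch. The class and
method names are invented for this example, and the classic identifiers are
written as "Space.Name" strings only to make the difference visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper contrasting unlinked and linked name/space features. */
class FeatureLinking {
    /** Unlinked features: every name is valid in every space. */
    static List<String> crossProduct(List<String> names, List<String> spaces) {
        List<String> ids = new ArrayList<String>();
        for (String name : names) {
            for (String space : spaces) {
                ids.add(space + "." + name);
            }
        }
        return ids;
    }

    /** Linked features: each name lists the spaces it is valid in. */
    static List<String> scoped(Map<String, List<String>> spacesByName) {
        List<String> ids = new ArrayList<String>();
        for (Map.Entry<String, List<String>> entry : spacesByName.entrySet()) {
            for (String space : entry.getValue()) {
                ids.add(space + "." + entry.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(crossProduct(Arrays.asList("N1", "N2"), Arrays.asList("S1", "S2", "S3")));
        Map<String, List<String>> linked = new LinkedHashMap<String, List<String>>();
        linked.put("N1", Arrays.asList("S1"));
        linked.put("N2", Arrays.asList("S2", "S3"));
        System.out.println(scoped(linked));
    }
}
```

With unlinked features the document answers to all six combinations; with
linked features only to the three that were declared.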

And it should not be called DocumentName. Maybe DocumentIdentifier is better? 
I'm too sleepy right 
now to come up with a good name.


Back to the DocumentNameFactory: as stated above, we should have several such
factories, each one able to construct a document identifier from some
specific data. ServletURLDocumentIdentifierFactory accepts URLs as used in a
servlet-based wiki, PortletURLDocumentIdentifierFactory accepts URLs as used
in a portlet wiki, HierarchicalServletURLDocumentIdentifierFactory also
accepts hierarchies, or extended spaces, XmlRpcDocumentIdentifierFactory
works with XmlRpc, and so on.
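
A rough sketch of how two of these factories could share one contract. The
interface, its method name and the UrlSegments helper are hypothetical; the
identifier is represented as a plain feature map to keep the example short;
and the URL layout /bin/<action>/<Space>/<Page> is assumed rather than taken
from the real URL handling code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical contract: build identifier features from a source of type S. */
interface DocumentIdentifierFactory<S> {
    Map<String, String> createIdentifier(S source);
}

/** Hypothetical helper: returns the path segments that follow "/bin/<action>/". */
final class UrlSegments {
    static List<String> afterAction(String path) {
        List<String> parts = new ArrayList<String>();
        for (String part : path.split("/")) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        int bin = parts.indexOf("bin");
        int start = bin + 2; // skip "bin" and the action, e.g. "view"
        if (bin < 0 || start > parts.size()) {
            return new ArrayList<String>();
        }
        return new ArrayList<String>(parts.subList(start, parts.size()));
    }
}

/** Flat reading: exactly one space segment followed by the page name. */
class ServletURLDocumentIdentifierFactory implements DocumentIdentifierFactory<String> {
    public Map<String, String> createIdentifier(String path) {
        Map<String, String> features = new LinkedHashMap<String, String>();
        List<String> segments = UrlSegments.afterAction(path);
        if (segments.size() == 2) {
            features.put("space", segments.get(0));
            features.put("name", segments.get(1));
        }
        return features;
    }
}

/** Hierarchical reading: all segments but the last form a nested space. */
class HierarchicalServletURLDocumentIdentifierFactory implements DocumentIdentifierFactory<String> {
    public Map<String, String> createIdentifier(String path) {
        Map<String, String> features = new LinkedHashMap<String, String>();
        List<String> segments = UrlSegments.afterAction(path);
        if (segments.isEmpty()) {
            return features;
        }
        int last = segments.size() - 1;
        if (last > 0) {
            features.put("space", String.join(".", segments.subList(0, last)));
        }
        features.put("name", segments.get(last));
        return features;
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        DocumentIdentifierFactory<String> factory = new HierarchicalServletURLDocumentIdentifierFactory();
        System.out.println(factory.createIdentifier("/xwiki/bin/view/Documents/Media/Music/WebPreferences"));
    }
}
```

The type parameter is there so that a portlet or XmlRpc factory could accept
something other than a URL path without changing the contract.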

One thing we must keep in mind is that XWiki also uses URLs to identify
attachments and files inside attachments, and skin files located in the FS.
Thus, I'm not sure whether these factories should return document
identifiers or a more general resource identifier (this would allow
identifying objects and properties, too, or even the generic fragment
identifiers Stephane used in
http://arkub.net/xwiki/bin/Blog/Farewell_SMTP).


On the tags vs. document features, I see them as related, but different in one 
essential point: 
features are typed tags. Now, do we want them to share the same storage 
mechanism, and the same way 
to access them from the Document object? Should they be stored together, with 
normal tags as 
features with no type, or with features as tags with a special syntax?
-- 
Sergiu Dumitriu
http://purl.org/net/sergiu/
_______________________________________________
devs mailing list
[email protected]
http://lists.xwiki.org/mailman/listinfo/devs
