Vincent Massol wrote:
> Hi,
>
> In the new rendering code I need to call some code that transforms
> [[[wiki:][Space.]][Doc]] into a link. I'm proposing to introduce 2 new
> classes/components in Core:
>
> * DocumentName: Represents a Document's Name. It'll have 3 properties:
> - String wiki
> - String space
> - String page
See below.
> * DocumentNameFactory: Create a DocumentName from a string
> representing a Document's name. Transforms [[[wiki:][Space.]][Doc]]
> into a DocumentName object.
See below.
> * The DocumentNameFactory would depend on the Execution component so
> that it can use the current wiki, current space and current document
> if these are not specified.
+1
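To make the proposal concrete, here is a minimal sketch of the two classes. It assumes the reference syntax is [wiki:][Space.]Page with no escaping, and it uses a hypothetical CurrentContext interface as a stand-in for whatever the Execution component would really expose; none of these signatures come from the actual Core code.

```java
import java.util.Objects;

/** Immutable value object for the three parts of a document name. */
final class DocumentName {
    private final String wiki;
    private final String space;
    private final String page;

    DocumentName(String wiki, String space, String page) {
        this.wiki = Objects.requireNonNull(wiki, "wiki");
        this.space = Objects.requireNonNull(space, "space");
        this.page = Objects.requireNonNull(page, "page");
    }

    String getWiki() { return wiki; }
    String getSpace() { return space; }
    String getPage() { return page; }

    @Override
    public String toString() { return wiki + ":" + space + "." + page; }
}

/** Hypothetical stand-in for what the Execution component would provide. */
interface CurrentContext {
    String currentWiki();
    String currentSpace();
    String currentPage();
}

final class DocumentNameFactory {
    private final CurrentContext context;

    DocumentNameFactory(CurrentContext context) { this.context = context; }

    /** Parses "[wiki:][Space.]Page"; missing parts come from the context. */
    DocumentName create(String reference) {
        String wiki = context.currentWiki();
        String space = context.currentSpace();
        String rest = reference;

        int colon = rest.indexOf(':');
        if (colon >= 0) {
            wiki = rest.substring(0, colon);
            rest = rest.substring(colon + 1);
        }
        // Assumption: the last dot separates the space from the page.
        int dot = rest.lastIndexOf('.');
        if (dot >= 0) {
            space = rest.substring(0, dot);
            rest = rest.substring(dot + 1);
        }
        String page = rest.isEmpty() ? context.currentPage() : rest;
        return new DocumentName(wiki, space, page);
    }
}
```

With a context whose current wiki is "xwiki" and current space is "Sandbox", create("WebHome") would yield xwiki:Sandbox.WebHome, while create("wiki1:Main.WebHome") keeps all three parts as given.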
> * This raises the question as to whether we should continue passing a
> String representing a document name in our APIs in the future or
> instead pass a DocumentName. I'm not yet sure what is the best answer
> to this...
DocumentName whenever possible, but also allow Strings for backwards
compatibility and for easy access from scripts. At least for the moment;
maybe later we can drop Strings, if we see that working with DocumentNames
is good, simple and easy.
> * Other question: In the Document object do we store the DocumentName
> object or do we store instead only Space and Wiki objects? If it's the
> latter then we need to fetch them from the DB which takes time. We
> could also decide to only fetch them when requested with getSpace()
> and getWiki() (i.e. lazy loading).
I don't know why we need to store wiki objects. As far as things are now,
wikis don't share the same database/schema. Sure, the document should be
able to access the wiki it belongs to, but I see no need for a persistent
relationship between the two. A reference to the wiki object can be added
when creating the Document object.
As for spaces, right now I think that first we must define what a space is
(or will be), and then see if it makes sense to make the link between
documents and spaces, and if this link should be persisted in the Document
object.
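For reference, the lazy loading mentioned in the question could look roughly like the sketch below. Lazy and LazyDocument are hypothetical names, and a plain String stands in for a real Space object; the loader would be whatever code actually fetches the space from the database.

```java
import java.util.function.Supplier;

/** Runs the loader at most once, on first access. */
final class Lazy<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader) { this.loader = loader; }

    synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}

/** Hypothetical document that only fetches its space when asked for it. */
final class LazyDocument {
    private final String name;
    private final Lazy<String> space; // String stands in for a Space object

    LazyDocument(String name, Supplier<String> spaceLoader) {
        this.name = name;
        this.space = new Lazy<>(spaceLoader);
    }

    String getName() { return name; }

    String getSpace() { return space.get(); }
}
```

Creating the document costs nothing extra; the loader only runs on the first getSpace() call and its result is reused afterwards.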
> * BTW this also raises the question as to whether we want to have a
> representation for space and wiki or not and instead only use tags, in
> which case a document name is simply a String like "mypage". But then
> it should be unique. So it could also be made of a list of identity
> tags as in: "space=sp1,sp2:wiki:wiki1:language=fr:mypage". Or we could
> standardize it as "wiki1:sp1,sp2:fr:mypage" and have the
> DocumentNameFactory transform it into tags. In that case the
> DocumentName object would be a Map of tags + the document name
> ("mypage"). I think we need to decide ASAP if we want to keep the
> strict and hardcoded notion of Wiki>Space>Document>Object>Property or
> instead go full tags since this changes completely the v2 interfaces
> and code we're writing.
There have been many posts on the folksonomy vs ontology, tags vs
hierarchies, loose semantics vs rigid semantics debates. So far, neither is
winning (at least not on all points, and not for everybody).
My take is that tags have the advantage that they are much more flexible and
sometimes better at organizing data, but hierarchies are needed, too. We
cannot get rid of spaces. A lot of users require them (and require even
deeper hierarchies). A lot of our features and strong points come from here,
although these features could be mostly reworked to be based on tags.
So, I think that we should put more power in tags. And we should keep
spaces. And add hierarchical spaces, too. But we should change the way
spaces and documents work.
The major problem with the current way spaces are implemented is that there
is a strong link between spaces, document IDs, URLs and the whole platform
code. This is wrong. URLs should be a way to access the wiki, and not a
strict, unique reference to documents. Spaces should be a way to organize
documents, not a major part of the document definition. Like in a FS, a file
is NOT defined by the directory it resides in. You can move files around
without changing the way they work or the data they contain. We should do
the same. To go further, documents are not at all dependent on the document
name, either. In a modern FS, a document can have several names in several
places, as _hard_ and symbolic links.
One of the best things about XWiki (and in general of wikis as opposed to
CMSs) is that documents have names, and not just numbers. But XWiki went too
far with this, by using the names as internal identifiers. Confluence got it
right: internal IDs are the unique identifiers, while pretty names are what
users see and use.
- Spaces should not be a part of documents, but a "feature", or "property"
of them.
- A document should have _at least_ one access name.
- The way URLs identify documents should be pluggable. We kind of have this
in one direction, with the URL factories, but we don't have it the other way
around. The XWiki giant class has only one method for finding a document,
given a URL.
- We should also have pluggable document identifier components. For example,
the language field should not be hardcoded in the Document class, but
another optional feature of documents. The space feature as well.
- When retrieving documents from the database, retrieve not an exactly
identified document, but one that best matches a set of criteria. For
example, retrieve a document that has the "name" "WebPreferences" (matches
172 documents); and which has the "space" "Documents.Media.Music" (matches 3
documents); and which has the "language" "en" (matches 1 document). All
these should be optional document properties. The only required document
property is the unique ID.
- All these document identification features could (should?) be components
(as in Plexus/IoC components).
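The criteria-based retrieval from the list above can be sketched as a simple filter. This is only an in-memory illustration under stated assumptions: StoredDocument and DocumentFinder are hypothetical names, features are plain string pairs, and a real implementation would translate the criteria into a database query instead of scanning a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** A document whose only mandatory property is its unique ID. */
final class StoredDocument {
    final long id;
    final Map<String, String> features; // optional: name, space, language...

    StoredDocument(long id, Map<String, String> features) {
        this.id = id;
        this.features = features;
    }
}

final class DocumentFinder {
    private final List<StoredDocument> documents;

    DocumentFinder(List<StoredDocument> documents) {
        this.documents = documents;
    }

    /** Returns every document carrying all of the requested feature values. */
    List<StoredDocument> find(Map<String, String> criteria) {
        List<StoredDocument> matches = new ArrayList<>();
        for (StoredDocument doc : documents) {
            // A document matches when each criterion is one of its features.
            if (doc.features.entrySet().containsAll(criteria.entrySet())) {
                matches.add(doc);
            }
        }
        return matches;
    }
}
```

Each additional criterion can only narrow the result, which is the name, then space, then language progression described in the example.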
Now, back to the DocumentName, it should not have a strong fixed type (wiki,
name, space and language), but a loose collection of features. It should be
able to have a constructor that interprets one map-like string (like Vincent
proposed), a constructor that can interpret old-style document names, a
constructor that receives a map (string -> string, feature name -> value),
and a plain constructor, with the features being set later with
identifier.set("property", "value").
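A minimal sketch of such a loose identifier follows. Two things in it are assumptions: the map-like syntax is taken to be "feature=value;feature=value" purely for illustration, and the two string-based "constructors" are written as static factory methods, because Java cannot overload two constructors that both take a single String.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class DocumentIdentifier {
    private final Map<String, String> features = new LinkedHashMap<>();

    /** Plain constructor; features are added later with set(). */
    DocumentIdentifier() { }

    /** Constructor receiving a feature name -> value map. */
    DocumentIdentifier(Map<String, String> features) {
        this.features.putAll(features);
    }

    /** Parses the hypothetical syntax "feature=value;feature=value". */
    static DocumentIdentifier fromFeatureString(String text) {
        DocumentIdentifier id = new DocumentIdentifier();
        for (String pair : text.split(";")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Not feature=value: " + pair);
            }
            id.set(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return id;
    }

    /** Parses an old-style name of the form "[wiki:][Space.]Page". */
    static DocumentIdentifier fromClassicName(String name) {
        DocumentIdentifier id = new DocumentIdentifier();
        String rest = name;
        int colon = rest.indexOf(':');
        if (colon >= 0) {
            id.set("wiki", rest.substring(0, colon));
            rest = rest.substring(colon + 1);
        }
        int dot = rest.lastIndexOf('.');
        if (dot >= 0) {
            id.set("space", rest.substring(0, dot));
            rest = rest.substring(dot + 1);
        }
        return id.set("name", rest);
    }

    DocumentIdentifier set(String feature, String value) {
        features.put(feature, value);
        return this;
    }

    /** Returns the feature value, or null if the feature is not set. */
    String get(String feature) { return features.get(feature); }

    Map<String, String> asMap() { return Collections.unmodifiableMap(features); }
}
```

Since every feature is optional, an identifier built from "WebHome" alone simply has no "wiki" or "space" entry rather than a default one.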
Since these are loose features, we can allow a Document object to have two
names, or three spaces. Do we want to link features between them? Like,
2 names x 3 spaces = 6 classic identifiers, or should we be able to say that
the name N1 is only valid in space S1, and N2 is valid in S2 and S3?
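The two options can be compared with a small sketch. ClassicNames is a hypothetical helper, and "Space.Name" is assumed as the classic identifier form: the first method takes the full cross product of independent features, the second only combines each name with the spaces it is explicitly linked to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ClassicNames {
    private ClassicNames() { }

    /** Unlinked features: every name is valid in every space. */
    static List<String> expand(List<String> spaces, List<String> names) {
        List<String> result = new ArrayList<>();
        for (String space : spaces) {
            for (String name : names) {
                result.add(space + "." + name);
            }
        }
        return result;
    }

    /** Linked features: each name lists the spaces in which it is valid. */
    static List<String> expandLinked(Map<String, List<String>> spacesByName) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : spacesByName.entrySet()) {
            for (String space : entry.getValue()) {
                result.add(space + "." + entry.getKey());
            }
        }
        return result;
    }
}
```

With spaces S1, S2, S3 and names N1, N2, expand gives the six identifiers from the question, while linking N1 to S1 and N2 to S2 and S3 gives only three: S1.N1, S2.N2 and S3.N2.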
And it should not be called DocumentName. Maybe DocumentIdentifier is
better? I'm too sleepy right now to come up with a good name.
Back to the DocumentNameFactory: as stated above, we should have several
such factories, each one able to construct a document identifier from some
specific data. ServletURLDocumentIdentifierFactory accepts URLs as used in a
servlet-based wiki, PortletURLDocumentIdentifierFactory accepts URLs as used
in a portlet wiki, HierarchicalServletURLDocumentIdentifierFactory also
accepts hierarchies, or extended spaces, XmlRpcDocumentIdentifierFactory
works with XmlRpc, and so on.
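As a rough illustration of the idea, the sketch below defines one common factory interface and two URL-based implementations. Everything here is hypothetical: the class names are shortened, the result is a plain feature map rather than a real identifier class, and the URL layout is assumed to be /<context>/bin/<action>/<Space>/<Page>, with extra path segments treated as nested spaces in the hierarchical variant.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Builds identifier features from one specific kind of source data. */
interface DocumentIdentifierFactory<S> {
    Map<String, String> create(S source);
}

/** Shared helpers; assumes the layout /<context>/bin/<action>/<...>. */
final class UrlPaths {
    private UrlPaths() { }

    /** Returns the path segments that follow the action segment. */
    static List<String> afterAction(URI url) {
        List<String> segments = Arrays.asList(url.getPath().split("/"));
        int bin = segments.indexOf("bin");
        if (bin < 0 || bin + 2 > segments.size()) {
            throw new IllegalArgumentException("Unexpected URL: " + url);
        }
        return segments.subList(bin + 2, segments.size());
    }

    static Map<String, String> features(String space, String name) {
        Map<String, String> features = new LinkedHashMap<>();
        features.put("space", space);
        features.put("name", name);
        return features;
    }
}

/** Accepts flat URLs ending in exactly Space/Page. */
final class ServletUrlIdentifierFactory implements DocumentIdentifierFactory<URI> {
    @Override
    public Map<String, String> create(URI url) {
        List<String> parts = UrlPaths.afterAction(url);
        if (parts.size() != 2) {
            throw new IllegalArgumentException("Expected Space/Page in " + url);
        }
        return UrlPaths.features(parts.get(0), parts.get(1));
    }
}

/** Also accepts nested spaces, joined with dots. */
final class HierarchicalServletUrlIdentifierFactory implements DocumentIdentifierFactory<URI> {
    @Override
    public Map<String, String> create(URI url) {
        List<String> parts = UrlPaths.afterAction(url);
        if (parts.size() < 2) {
            throw new IllegalArgumentException("Expected Space/.../Page in " + url);
        }
        int last = parts.size() - 1;
        String space = String.join(".", parts.subList(0, last));
        return UrlPaths.features(space, parts.get(last));
    }
}
```

A portlet or XML-RPC factory would implement the same interface with a different source type, so callers never need to know where an identifier came from.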
One thing we must keep in mind is that XWiki also uses URLs to identify
attachments and files inside attachments, and skin files located in the FS.
Thus, I'm not sure these factories should return document identifiers, or a
more general resource identifier (this would allow to identify objects and
properties, too, or even the generic fragment identifiers Stephane used in
http://arkub.net/xwiki/bin/Blog/Farewell_SMTP).
On the tags vs. document features, I see them as related, but different in
one essential point: features are typed tags. Now, do we want them to share
the same storage mechanism, and the same way to access them from the
Document object? Should they be stored together, with normal tags as
features with no type, or with features as tags with a special syntax?
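One way to picture both options at once is a single Tag class with an optional type. This is only a sketch: the class is hypothetical, and the "type=value" text form is an assumed example of a "special syntax", not something already decided.

```java
import java.util.Optional;

/** A plain tag has no type; a feature is a tag that carries one. */
final class Tag {
    private final String type; // null for a plain tag
    private final String value;

    private Tag(String type, String value) {
        this.type = type;
        this.value = value;
    }

    static Tag plain(String value) { return new Tag(null, value); }

    static Tag typed(String type, String value) { return new Tag(type, value); }

    /** Reads "type=value" as a feature and anything else as a plain tag. */
    static Tag parse(String text) {
        int eq = text.indexOf('=');
        return eq > 0
            ? typed(text.substring(0, eq), text.substring(eq + 1))
            : plain(text);
    }

    Optional<String> type() { return Optional.ofNullable(type); }

    String value() { return value; }

    /** Text form suitable for storing features and tags side by side. */
    String serialize() {
        return type == null ? value : type + "=" + value;
    }
}
```

Stored this way, "language=fr" and "music" can live in the same collection, and the Document object can tell them apart by checking whether a type is present.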
--
Sergiu Dumitriu
http://purl.org/net/sergiu/
_______________________________________________
devs mailing list
http://lists.xwiki.org/mailman/listinfo/devs