Stefano Mazzocchi wrote:

Reinhard Poetz wrote:


<snip/>

good point.
I also want every comment (not only doc changes) to be approved by a committer. So we have a double barrier (hope that's understandable English) against spamming bots.


Ok, captchas + human moderation is clearly too high of a barrier for spammers and even for defacers. Even infra@ would not have a problem with that.


There's an interesting section on circumventing captchas at Wikipedia [1]. Are we "interesting enough" in terms of Google ranking to attract such things?

Two questions for Stefano (and everybody else): I proposed numbers as document IDs. What do you think about this?


I used to be a fanatic of 'readable URLs'... but I think they present more problems than they solve.

First of all, the encoding is a pain. It's fine for English, but until we have IRI (internationalized resource identifiers, think "unicode meets URI") support, forget Chinese, Japanese, Cyrillic, Hebrew, Korean and so on.

One common solution is to have an English title even for non-English pages. I dislike that, it's very anglo-centric.


Well, consider the state of Cocoon, the ASF, the open source world and the whole IT industry: they're all anglo-centric. Would you have the same concerns if this was Esperanto or Interlingua rather than English (or more precisely "international English")?

Furthermore, translations must follow the original reference docs, which are the English ones. So having all language-specific resources use the same name as their English counterpart isn't a problem to me.

Second, people like Nielsen argue that readable URLs are easier to use and to remember. I think it's bullshit. Not even my bookmarks satisfy me anymore in terms of link management (del.icio.us + Google killed my browser bookmarks), do you really think I would type in or remember any URL today? Nonsense.


I do remember a lot of URLs, provided of course that they are meaningful. And I have a very powerful tool to help me crawl through this tree of URLs that I know: the Firefox address bar autocompletion (which BTW is just a reuse of the unix command-line behaviour).

And the more you use a URL, the more it engraves itself into your mind. Nothing new in the cognition area here, but that means that a lot of regular users of Cocoon know the URL space of its documentation by heart, or at least the main directory names.

There are a few values of a readable URL. The first is actionable breadcrumbs.


Breadcrumbs are better generated from the navigational structure than from the page path, even if both often match.

So, if you find yourself in

  http://site.com/a/b/c/d/e/f/g

you can automatically infer something like

  site.com > a > b > c > d > e > f > g

and, for shitty web sites, that is a *tremendous* navigation help. For URLs like

  http://site.com/page/39884984

that's it, there is no hierarchical context that you can infer from it.
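A minimal sketch of the path-based inference described above, assuming nothing more than the standard library URL parser. The function names `breadcrumbs` and `trail` are illustrative, not part of any existing tool:

```python
from urllib.parse import urlsplit


def breadcrumbs(url):
    """Return (label, link) pairs inferred purely from the URL path."""
    parts = urlsplit(url)
    root = f"{parts.scheme}://{parts.netloc}"
    crumbs = [(parts.netloc, root + "/")]
    link = root
    for segment in parts.path.split("/"):
        if not segment:
            continue  # skip empty pieces from leading or doubled slashes
        link = f"{link}/{segment}"
        crumbs.append((segment, link))
    return crumbs


def trail(url):
    """Render the inferred breadcrumbs as a 'a > b > c' string."""
    return " > ".join(label for label, _ in breadcrumbs(url))


print(trail("http://site.com/a/b/c/d/e/f/g"))
print(trail("http://site.com/page/39884984"))
```

For the hierarchical URL every crumb is a usable link to a parent level. For the numeric one the same code only yields "page" and an opaque number, so any context has to come from the page itself.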

Now, we will not have a shitty web site, so this argument doesn't apply, and Amazon (which is the most used e-commerce site in the world *and* has the worst URL space ever imagined!) shows that URL-space design does not impact usability if the pages don't depend on it.


Yeah, but Amazon is a large catalogue of things, not a documentation covering lots of different subjects from introduction to details.

Actually, since geeks are used to hacking URLs but normal people are not, having a flat or bad URL space forces usability people to think about navigation in the page and not outside it.


How much I dislike sites that make me start from the main page and work my way down to a particular page that I've already seen...

Another argument, and probably more important, is that a flat URL structure gives a sense of 'wikiness' that people have come to dislike.

Now, again, this is a false impression (inspired by a plethora of bad practices rather than effectual technological limitations) but a strong one nevertheless (I do feel the same about it at times).

But *exactly* because of that, I think we should be brave and show the world that a flat URL space *does not* automatically yield 'wiki-like' flat spaces that are extremely painful to navigate.

Flat numeric URL spaces also have extremely interesting advantages:

- pages can have their titles adjusted without impacting persistence (links are more solid over time)


Adjusting a title doesn't mean you change the page's content, in which case there's no need to change its name. And if its content changes, then it's a different page with a different name.

- pages can be rearranged/repurposed/re-aggregated/re-used without impacting persistence


Agreed for "rearranged", as a flat space allows changing the navigation tree without impacting path names. But repurposing a page requires changing its name (or id), and re-aggregating means removing (aggregation) or adding (split) some pages.
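To make the "rearranged" case concrete, here is a small sketch where the URL depends only on a numeric id and the breadcrumbs come from a separate navigation tree. The `DocSpace` class, the page titles and the small ids are hypothetical. The base URL and the id 3984948 are borrowed from the example link used in this thread:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Page:
    page_id: int
    title: str
    parent: Optional[int] = None  # position in the navigation tree


class DocSpace:
    """Flat URL space: the URL is built from the id alone."""

    def __init__(self, base: str = "http://cocoon.apache.org/2.2"):
        self.base = base
        self.pages: Dict[int, Page] = {}

    def add(self, page_id: int, title: str, parent: Optional[int] = None) -> None:
        self.pages[page_id] = Page(page_id, title, parent)

    def url(self, page_id: int) -> str:
        return f"{self.base}/{page_id}"

    def breadcrumbs(self, page_id: int) -> List[str]:
        # Walk up the navigation tree, not the URL path.
        trail = []
        current: Optional[int] = page_id
        while current is not None:
            page = self.pages[current]
            trail.append(page.title)
            current = page.parent
        return list(reversed(trail))


docs = DocSpace()
docs.add(1, "User docs")
docs.add(2, "Flow", parent=1)
docs.add(3984948, "Continuations", parent=2)

before = docs.url(3984948)
docs.pages[3984948].title = "Flow continuations"  # retitle
docs.pages[3984948].parent = 1                    # move in the tree
assert docs.url(3984948) == before                # the link did not change
print(" > ".join(docs.breadcrumbs(3984948)))
```

Retitling and moving leave the link untouched. Splitting or merging pages would still create or retire ids, which is the limit pointed out above.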

Another question is the structure of URLs: Sylvain's new effort to provide some docs in French means we need to think about where to put them.


Wait, wait! I haven't proposed to translate the docs!! That is a tremendous effort! I proposed to just translate the introductory page to accompany the French-speaking mailing list.


I propose

http://c.a.o/ ............... editable global docs (own repository)
http://c.a.o/fr/ ............. editable global docs in French (own repository)
http://c.a.o/2.2/ ............ editable docs of 2.2 (own repository)
http://c.a.o/2.2/fr/ ......... editable docs of 2.2 in French (own repository)
http://c.a.o/2.2.1/ .......... "frozen" docs of the 2.2.1 release
http://c.a.o/2.2.1/fr/ ....... "frozen" French docs of the 2.2.1 release


I don't think we should have frozen docs at any time, they are included in the distributions anyway and those distributions will be persisted for the longest time.

Sun did this with the Java API and created a mess: people linked to java/1.4.2/ and then 1.4.3 was created and all links broke.

If a document shipped in 2.1.3 has a bug and was fixed in 2.1.4, why would anybody want to see it? And if 2.1.4 removed something useful for 2.1.3, that's a bug and we should fix it in the doc, rather than make everything available on the web.

So I'm -1 on this.


Agree. We may want to keep around the docs for each major release (i.e. 2.0, 2.1, 2.2) as Tomcat does, but certainly not the docs for minor releases (i.e. 2.2 and 2.2.1).

As for French docs, I *strongly* think that we should do this through content negotiation rather than URL design. A person accessing the page with a French browser will get the page in French, that's all they have to know (and the page will have a series of flags that will trigger an overload in locale, but that's going to be a parameter of the URL, not part of it).

The language a page is written in, just like the data type of the page, does not belong in the URL.

This makes the URL space way more "solid" over time: I can link to
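A rough sketch of what such negotiation could look like, assuming the server reads the standard HTTP Accept-Language header and that the locale flags end up as a query parameter. The helper names and the `override` argument are hypothetical, and real servers handle more cases (wildcards, longer language ranges) than this does:

```python
def parse_accept_language(header):
    """Parse an Accept-Language header into (language, q) pairs, best first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        lang, _, params = item.partition(";")
        q = 1.0  # a missing q value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((lang.strip().lower(), q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_language(header, available, default="en", override=None):
    """Pick the language to serve for one URL; override models the flag parameter."""
    if override and override.lower() in available:
        return override.lower()
    for lang, q in parse_accept_language(header):
        if q <= 0:
            continue
        if lang in available:
            return lang
        primary = lang.split("-")[0]  # fall back from fr-FR to fr
        if primary in available:
            return primary
    return default


print(choose_language("fr-FR,fr;q=0.9,en;q=0.8", {"en", "fr"}))
print(choose_language("fr", {"en", "fr"}, override="en"))
```

The same URL serves both readers. Only the request header, or the explicit override, decides which translation comes back.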

 http://cocoon.apache.org/2.2/3984948

and *be sure* that it will be there a few years from now and, by then, maybe a translation in my native language will have popped up!


And why shouldn't e.g. http://cocoon.apache.org/2.1/userdocs/flow/continuations.html still be there?

let's be brave!


Let's be brave and dive into a fog of meaningless URLs? I'm not convinced...

Sylvain

[1] http://en.wikipedia.org/wiki/Captcha#Circumvention

--
Sylvain Wallez                                  Anyware Technologies
http://www.apache.org/~sylvain           http://www.anyware-tech.com
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }


