Hi Sergiu,
On Tue, Mar 15, 2011 at 16:17, Sergiu Dumitriu <[email protected]> wrote:
> On 03/15/2011 07:27 PM, Víctor A. Rodríguez (Bit-Man) wrote:
>> Hi,
>>
>> we're using XWiki as our main local knowledge repository (kind of) and
>> we've started some gardening.
>> We did some assessment and some minimal cleanup, but as far as we can
>> see we'll need to do some automated wiki gardening: broken link
>> detection, duplicated content, etc.
>>
>> Do you use any automated tools for this, or do you simply do the
>> cleaning manually?
>
> No, not really. What can be done is to detect such content.
>
> There's the "Orphaned Pages" tab in "Document Index", which can list
> pages which don't have a valid parent.
>
> This snippet can detect broken links:
> http://extensions.xwiki.org/xwiki/bin/Extension/All+Broken+Links
Thanks a lot, I'll take a look at the extensions.
> Duplicated content is harder to find, and depends on what you understand
> by "duplicate content". If you mean an exact copy, character by character,
> you could use something like this (works on MySQL; depends on the RDBMS
> implementing an MD5 function):
>
> {{velocity}}
> #foreach($doc in $xwiki.search("select doc.fullName from XWikiDocument
> doc where MD5(doc.content) in (select MD5(d.content) from XWikiDocument
> d group by MD5(d.content) having count(*) > 1)"))
> * [[$doc]]
> #end
> {{/velocity}}
>
> This only checks the content field, so it will also report all documents
> based on the template+sheet pattern, since they share the same content.
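In case our database turns out not to have an MD5 function, I guess the same exact-copy check could be done outside the wiki. A minimal sketch in Python, assuming the page contents were already exported into a dict keyed by full page name (the `pages` dict, the sample page names and `find_exact_duplicates` are made up for illustration, not part of XWiki):

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(pages):
    """Group page names whose content is character-for-character identical.

    `pages` is a hypothetical dict mapping a page's full name to its content.
    """
    groups = defaultdict(list)
    for name, content in pages.items():
        # Same idea as MD5(doc.content) in the query: hash the content field only.
        digest = hashlib.md5(content.encode("utf-8")).hexdigest()
        groups[digest].append(name)
    # Keep only digests shared by more than one page.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Made-up sample data standing in for an export of the wiki.
pages = {
    "Main.WebHome": "Welcome to the wiki",
    "Sandbox.Copy": "Welcome to the wiki",
    "Main.Other": "Something else",
}
print(find_exact_duplicates(pages))
```

Like the query, this only looks at the content, so pages created from the same template would still be grouped together.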
>
> If you mean "fairly similar to", then that's not something that can be
> done out of the box, but you could integrate a third party tool
> dedicated to finding similar documents, feed it the content of the wiki,
> and check its results.
Agreed, it's kind of an "ambitious goal".
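If we ever try it, a first approximation might be to compare pages by how many three-word sequences they share (Jaccard similarity on word shingles). This is a generic technique I'm assuming here, not any specific third-party tool, and all names, the sample pages and the 0.5 threshold below are hypothetical. A rough sketch in Python, again assuming the content has been exported to a dict:

```python
from itertools import combinations


def shingles(text, size=3):
    """Return the set of word n-grams (shingles) found in `text`."""
    words = text.lower().split()
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pages(pages, threshold=0.5):
    """Yield (name1, name2, score) for page pairs scoring at or above `threshold`."""
    sets = {name: shingles(content) for name, content in pages.items()}
    # Compares every pair of pages, so the cost grows quadratically.
    for n1, n2 in combinations(sorted(sets), 2):
        score = jaccard(sets[n1], sets[n2])
        if score >= threshold:
            yield n1, n2, score


# Made-up sample data standing in for an export of the wiki.
sample = {
    "A": "how to install the wiki on linux servers",
    "B": "how to install the wiki on windows servers",
    "C": "meeting notes from last tuesday",
}
for name1, name2, score in similar_pages(sample):
    print(name1, name2, score)
```

Comparing all pairs is fine for a small wiki; for a large one a dedicated tool would probably be the better choice, as you say.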
> As for the cleanup part, that has to be done manually. Or, with a bit of
> scripting, you can do whatever you want with the reported documents.
Thanks a lot for your help!
--
Víctor A. Rodríguez (http://www.bit-man.com.ar)
El bit Fantasma (Bit-Man)
Programming: love it or leave it