Hello all,

I would like to set up the broken link checker for my CMS, but before I start I need to check that I have understood how it works. As far as I know, the checker creates an XML file in the repository that serves as the "database" of the inspected links. To build this document, it browses the repository looking for a WebDAV property called "links".

So, if this is right, I need to configure an extractor in the repository.

Now, I have a document like this:
------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<document>
  <metaCurSection>multimedia</metaCurSection>
  <taxonomies> </taxonomies>
  <primaryData lang="it">
    <content >
      <html>
        <body>
          <a href="http://www.google.com" title="prova">testlink</a>
        </body>
      </html>
    </content>
    <shortDescription />
    <title>Test di Impaginazione Template</title>
  </primaryData>
  <attachments lang="it">
    <externalLinks>
      <externalLink label="Prova" order="1" url="http://www.google.com/"/>
    </externalLinks>
    <assets>
      <asset order="1" path="/binaries/sandbox/urb_part.gif"/>
    </assets>
    <images>
      <image alt="Prova Formattazione" order="1" path="/binaries/sandbox/nx03_wallpaper01.jpg"/>
    </images>
    <relatedDocs>
      <relatedDoc order="1" path="/content/taxonomies/ankonline/uffici/stampa/conferenze/2007/nuovodoc.xml"/>
    </relatedDocs>
  </attachments>
  <multimedia lang="it">
    <stream externalPath="/video/test.flv" repository="external"/>
  </multimedia>
  <secondaryData lang="it">
    <tickets />
    <other />
  </secondaryData>
  <contacts lang="it">
    <info />
    <timeTable />
    <telephones>
      <telephone number="112324345" order="1"/>
    </telephones>
    <faxes>
      <fax number="12121341" order="2"/>
    </faxes>
    <emails>
      <email address="[email protected]" order="3"/>
    </emails>
  </contacts>
</document>

I need to extract:
- The links in HTML fields such as "/document/primaryData/content"
- The external links in "/document/attachments/externalLinks/externalLink"
- The internal links in "/document/attachments/relatedDocs/relatedDoc"
- The images in "/document/attachments/images/image"
- The assets in "/document/attachments/assets/asset"
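
To make the list above concrete, here is a minimal Python sketch of the values I expect to end up in the "links" property. It is not an extractor configuration and says nothing about how the repository does this internally; it only illustrates the intended result. The names `SAMPLE`, `RULES` and `extract_links` are made up for this example, the paths use the element names exactly as they appear in the sample document, and the assumption that each link lives in a single attribute (`href`, `url` or `path`) is mine.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document above (only the link-bearing parts).
SAMPLE = """<document>
  <primaryData lang="it">
    <content>
      <html>
        <body>
          <a href="http://www.google.com" title="prova">testlink</a>
        </body>
      </html>
    </content>
  </primaryData>
  <attachments lang="it">
    <externalLinks>
      <externalLink label="Prova" order="1" url="http://www.google.com/"/>
    </externalLinks>
    <assets>
      <asset order="1" path="/binaries/sandbox/urb_part.gif"/>
    </assets>
    <images>
      <image alt="Prova Formattazione" order="1" path="/binaries/sandbox/nx03_wallpaper01.jpg"/>
    </images>
    <relatedDocs>
      <relatedDoc order="1" path="/content/taxonomies/ankonline/uffici/stampa/conferenze/2007/nuovodoc.xml"/>
    </relatedDocs>
  </attachments>
</document>
"""

# Hypothetical rule table: (path relative to <document>, attribute holding the link).
RULES = [
    ("primaryData/content//a", "href"),                 # links inside HTML fields
    ("attachments/externalLinks/externalLink", "url"),  # external links
    ("attachments/relatedDocs/relatedDoc", "path"),     # internal links
    ("attachments/images/image", "path"),               # images
    ("attachments/assets/asset", "path"),               # assets
]


def extract_links(xml_text):
    """Return every link value matched by RULES, in rule order."""
    root = ET.fromstring(xml_text)
    links = []
    for path, attribute in RULES:
        for element in root.findall(path):
            value = element.get(attribute)
            if value:  # skip elements that lack the attribute
                links.append(value)
    return links


if __name__ == "__main__":
    for link in extract_links(SAMPLE):
        print(link)
```

For the sample document this yields five values: the two google.com URLs, the related document path, the image path and the asset path.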

Can someone show me an example of an extractor configuration?
Thanks in advance.

--
By MCM.

<< Theory is when you know everything but nothing works.
Practice is when everything works but nobody knows why.
In any case, one ends up combining theory with practice: nothing works and nobody knows why. >>
(A. Einstein)
********************************************
Hippocms-dev: Hippo CMS development public mailinglist

Searchable archives can be found at:
MarkMail: http://hippocms-dev.markmail.org
Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
