Hello Nico,

I understand your use case. You could also choose to set a property on a
document only when it is not empty. We had a customer who wanted this.
Would that result in the same behavior you have now? A property would
then only be set when it is not empty, and for a document in the wrong
namespace the XPath matches nothing, so the value is empty... or am I
missing something?
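
To make that idea concrete, here is a minimal sketch in plain JAXP. It is
not the Hippo extractor code: the class name, the helper methods and the
sample documents are made up for illustration, and only the two namespace
URIs and property names come from Nico's configuration. Each XPath is tied
to one namespace, and a property is stored only when the result is not
empty.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch, not nl.hippo.slide.extractor code.
public class NonEmptyPropertySketch {
    static final String CATALOG_NS = "http://www.mycompany.com/webshop/catalog";
    static final String PRODUCT_NS = "http://www.mycompany.com/webshop/product";

    // Parse with namespace awareness, otherwise namespace-uri() is always "".
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Build a prefix-free XPath that matches an element in one namespace only.
    static String elementIn(String namespace, String localName) {
        return "//*[local-name()='" + localName
                + "' and namespace-uri()='" + namespace + "']";
    }

    // Evaluate every instruction; keep a property only when its value is not empty.
    static Map<String, String> extract(Document doc) throws Exception {
        Map<String, String> instructions = new LinkedHashMap<>();
        instructions.put("catalog_name", elementIn(CATALOG_NS, "catalogName"));
        instructions.put("product_name", elementIn(PRODUCT_NS, "productName"));

        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> properties = new LinkedHashMap<>();
        for (Map.Entry<String, String> instruction : instructions.entrySet()) {
            // evaluate() returns "" when the expression selects no node.
            String value = xpath.evaluate(instruction.getValue(), doc).trim();
            if (!value.isEmpty()) {
                properties.put(instruction.getKey(), value);
            }
        }
        return properties;
    }

    public static void main(String[] args) throws Exception {
        Document catalog = parse("<catalog xmlns='" + CATALOG_NS
                + "'><catalogName>Spring</catalogName></catalog>");
        System.out.println(extract(catalog)); // prints {catalog_name=Spring}
    }
}
```

With this approach a product document in the same directory would get only
product_name, without any target namespace check on the root element.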

What I do not get from the extractors.xml below (this is off the top of
my head, so I might be missing something) is that, for example, the first
extractor has the xpath:

xpath="//c:catalogName"/>

Now, where is the c: prefix namespace declared? 
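
For comparison, this is how a prefix gets bound in plain JAXP. How
HippoSimpleXmlExtractor resolves prefixes is not shown in this thread, so
take this only as a sketch of the standard mechanism; the class and method
names are hypothetical. The prefix in the XPath is mapped to a namespace
URI through a NamespaceContext and does not have to match the prefix used
in the document itself.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch of prefix binding with the standard JAXP API.
public class PrefixBindingSketch {
    static final String CATALOG_NS = "http://www.mycompany.com/webshop/catalog";

    // A NamespaceContext that knows exactly one prefix-to-URI binding.
    static NamespaceContext bind(final String prefix, final String uri) {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String p) {
                return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceUri) {
                return uri.equals(namespaceUri) ? prefix : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceUri) {
                String p = getPrefix(namespaceUri);
                return p == null
                        ? Collections.<String>emptyIterator()
                        : Collections.singletonList(p).iterator();
            }
        };
    }

    // Evaluate //c:catalogName with "c" bound to the catalog namespace.
    static String catalogName(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(bind("c", CATALOG_NS));
        return xpath.evaluate("//c:catalogName", doc);
    }

    public static void main(String[] args) throws Exception {
        // The document uses the prefix "cat"; the XPath uses "c". Only the URI matters.
        String xml = "<cat:catalog xmlns:cat='" + CATALOG_NS + "'>"
                + "<cat:catalogName>Spring</cat:catalogName></cat:catalog>";
        System.out.println(catalogName(xml)); // prints Spring
    }
}
```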

Regards Ard

> 
> Ard,
> 
> The fact that only the namespace of the root element is taken
> into account is what it's all about. The idea behind it is that
> you can define a set of different properties on different
> documents while they are stored within the same location
> (directory). Imagine a webshop where you store product
> information and catalog information alongside each other. This
> makes it possible to use DASL queries that act only on
> product files or catalog files.
> I don't know if there are many use cases. I know one at least ;-).
> 
> The extractor.xml file might look like:
> 
>   <extractor classname="nl.hippo.slide.extractor.HippoSimpleXmlExtractor"
>       uri="/files/default.preview/content/appdata/webshop"
>       content-type="text/xml | text/xml; charset=UTF-8 | application/xml">
>     <configuration targetNamespace="http://www.mycompany.com/webshop/catalog">
>       <instruction property="catalog_name" namespace="http://hippo.nl/cms/1.0"
>           xpath="//c:catalogName"/>
>     </configuration>
>   </extractor>
> 
>   <extractor classname="nl.hippo.slide.extractor.HippoSimpleXmlExtractor"
>       uri="/files/default.preview/content/appdata/webshop"
>       content-type="text/xml | text/xml; charset=UTF-8 | application/xml">
>     <configuration targetNamespace="http://www.mycompany.com/webshop/product">
>       <instruction property="product_name" namespace="http://hippo.nl/cms/1.0"
>           xpath="//p:productName"/>
>     </configuration>
>   </extractor>
> 
> 
> Kind regards
> 
> Nico Tromp
> 
> 
> On Thu, Oct 9, 2008 at 11:14 AM, Ard Schrijvers
> <[EMAIL PROTECTED]> wrote:
> 
> > Hello Nico,
> >
> > Thx first of all for sharing your code. Could you also post me an 
> > example of the extractor using this?
> >
> > Also, I see in the patch:
> >
> > String documentNamespace = 
> > document.getRootElement().getNamespaceURI();
> >
> > This means that the target namespace must match the namespace of the
> > root element, right? So, the root element namespace defines in this
> > case whether a property will be set or won't be set. I am not sure
> > whether this is a really common use case. Are there many use cases for
> > this behavior?
> >
> > Regards Ard
> >
> > >
> > > Hi all,
> > >
> > > as part of the project I'm currently working on, we needed to be
> > > able to have extractors that only act on documents depending on the
> > > namespace of the document. (See an earlier post:
> > > http://www.nabble.com/Setting-properties-per-document-namespace-td19380940.html
> > > or
> > > http://www.nabble.com/Mixed-content-and-extractors-td19319614.html).
> > > I finally made some changes to the Hippo extractors. They are now
> > > able to take into account the namespace of the document. If you
> > > specify a so-called target namespace, the properties are only set if
> > > the document has the same namespace. If you don't specify the target
> > > namespace, the properties are set on every document. The target
> > > namespace is set as an attribute on the 'configuration' element
> > > within the 'extractor' element. I think it should have been an
> > > attribute of the 'extractor' element, but that would mean a lot more
> > > changes and I just wanted it to work.
> > >
> > > If you like the idea, please feel free to use the code. And maybe
> > > the Hippo guys can apply them to their code so everybody can benefit
> > > from it.
> > >
> > > Please see the attached diff files for the patches.
> > >
> > >
> > > Have fun
> > >
> > > Nico Tromp
> > >
> > ********************************************
> > Hippocms-dev: Hippo CMS development public mailinglist
> >
> > Searchable archives can be found at:
> > MarkMail: http://hippocms-dev.markmail.org
> > Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
> >
> >