Hi Daniel,

> -----Original Message-----
> From: Daniel Florey [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, 24 February 2004 13:23
> To: Slide Developers Mailing List
> Subject: Re: Full Text Search for MS Word and Excel files?
> 
> 
> Hi Martin,
> my proposal would look like this:
> 
> public interface Extractor {
>       /**
>       * Will be called from extractor framework before content and
>       * properties will be stored
>       */
>       public void extract(InputStream content) throws ExtractException;

agreed

>       
>       /**
>        * gets extracted property value from the resource, for example "author"
>        * for a word doc, ...
>        *
>       */
>       public String getPropertyValue(String propertyName);
> 
>       /**
>       * gets a description of all properties that are provided by this
>       * extractor.
>       * Can be used by indexing framework to e.g. generate columns in
>       * index table

Of course the store / indexer could do whatever it wants with the
properties, but I think the normal case should be to write the
properties into the DescriptorStore as NodeProperties, so that they
can be exposed to DASL. So what about the following comment:

* Can be stored as NodeProperties in the DescriptorStore


>       */
>       public PropertyDescriptor[] getPropertyDescriptors();
> }
> 
> I prefer InputStream for content because the whole document doesn't
> have to be loaded into memory.

agreed.
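
To check that we mean the same thing, here is a rough sketch of how I
picture the interface being implemented and called. It is only an
illustration under assumptions: ExtractException and PropertyDescriptor
are minimal stand-ins, since we have not defined them yet, and
PlainTextExtractor, ExtractorSketch, collectProperties and the property
names "lineCount" / "charCount" are made up for the example. A plain Map
stands in for the DescriptorStore, so no real Slide API is shown.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: the thread names this exception but does not define it.
class ExtractException extends Exception {
    ExtractException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical stand-in: just a property name plus a human-readable description.
class PropertyDescriptor {
    final String name;
    final String description;

    PropertyDescriptor(String name, String description) {
        this.name = name;
        this.description = description;
    }
}

// The interface as proposed in this thread.
interface Extractor {
    void extract(InputStream content) throws ExtractException;

    String getPropertyValue(String propertyName);

    PropertyDescriptor[] getPropertyDescriptors();
}

// Toy extractor (assumption): reads UTF-8 text line by line, so the whole
// document never has to be held in memory at once.
class PlainTextExtractor implements Extractor {
    private final Map<String, String> values = new HashMap<>();

    public void extract(InputStream content) throws ExtractException {
        values.clear();
        long lines = 0;
        long chars = 0; // line terminators are not counted
        try {
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(content, StandardCharsets.UTF_8));
            String line;
            while ((line = reader.readLine()) != null) {
                lines++;
                chars += line.length();
            }
        } catch (IOException e) {
            throw new ExtractException("could not read content", e);
        }
        values.put("lineCount", Long.toString(lines));
        values.put("charCount", Long.toString(chars));
    }

    public String getPropertyValue(String propertyName) {
        return values.get(propertyName); // null if the property is unknown
    }

    public PropertyDescriptor[] getPropertyDescriptors() {
        return new PropertyDescriptor[] {
            new PropertyDescriptor("lineCount", "number of lines"),
            new PropertyDescriptor("charCount", "number of characters"),
        };
    }
}

class ExtractorSketch {
    // Runs the extractor, then copies every described property into a map
    // that stands in for the descriptor store.
    static Map<String, String> collectProperties(Extractor extractor, InputStream content)
            throws ExtractException {
        extractor.extract(content);
        Map<String, String> properties = new LinkedHashMap<>();
        for (PropertyDescriptor descriptor : extractor.getPropertyDescriptors()) {
            String value = extractor.getPropertyValue(descriptor.name);
            if (value != null) {
                properties.put(descriptor.name, value);
            }
        }
        return properties;
    }

    public static void main(String[] args) throws ExtractException {
        InputStream in = new ByteArrayInputStream(
                "first line\nsecond".getBytes(StandardCharsets.UTF_8));
        System.out.println(collectProperties(new PlainTextExtractor(), in));
    }
}
```

The point of the sketch is the calling order: extract() first, then one
getPropertyValue() per descriptor. How the values actually end up as
NodeProperties is left open here.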


Best regards,
Martin
