Jukka Zitting wrote:
> Hi,
> 
> On Sun, May 17, 2009 at 9:03 AM, Robert Burrell Donkin
> <[email protected]> wrote:
>> IMHO it makes sense to factor out an interface, retain the existing
>> implementation and create a separate module. this will allow assemblers
>> who don't want to use tika to create applications that don't use it.
> 
> How about using the org.apache.tika.detect.Detector interface (see below)?
> 
> Tika comes with default implementations of the interface, but it
> should be straightforward to implement the interface also based on
> alternative implementations.

i see this as a stepping stone. tika already supports most of the
heuristics rat uses so IMHO it would make more sense to feed back rules
upstream (either into the default typer, or a variant tuned for
development).
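
a minimal sketch of the "factor out an interface, retain the existing
implementation" step, in java. every name here (DocumentTypeDetector,
DocumentCategory, HeuristicDetector) is hypothetical and not taken from rat
or tika, and the null-byte check is only a stand-in for whatever heuristics
rat actually applies. a tika-backed implementation of the same interface
could then live in a separate, optional module.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical coarse classification; the real categories may differ.
enum DocumentCategory { BINARY, TEXT }

// Hypothetical abstraction owned by the core module. Callers depend only
// on this interface, so the core module needs no tika dependency.
interface DocumentTypeDetector {
    /** Classifies the content of the stream; name is an optional hint. */
    DocumentCategory detect(InputStream input, String name) throws IOException;
}

// Stand-in for the retained built-in implementation. The null-byte test
// is an illustrative assumption, not the actual heuristic.
final class HeuristicDetector implements DocumentTypeDetector {
    private static final int SAMPLE_SIZE = 512;

    @Override
    public DocumentCategory detect(InputStream input, String name) throws IOException {
        byte[] sample = new byte[SAMPLE_SIZE];
        // A single read may return fewer bytes than requested; that is
        // acceptable for a sketch that only samples the head of the stream.
        int read = input.read(sample, 0, SAMPLE_SIZE);
        for (int i = 0; i < read; i++) {
            if (sample[i] == 0) {
                return DocumentCategory.BINARY;
            }
        }
        // Empty streams (read == -1) fall through and count as text here.
        return DocumentCategory.TEXT;
    }
}
```

an adapter that delegates to an org.apache.tika.detect.Detector would
implement the same DocumentTypeDetector interface and be selected by the
assembler, leaving the default build free of tika.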

a couple of issues that suggest that this might be better than jumping
to tika right away:

1. in terms of interface reuse ATM tika trunk doesn't offer a minimal api[1]
2. the latest release (tika 0.2) is not modular

- robert

[1] IMHO breaking out a tika api module with minimal dependencies would
encourage wider use of tika's basic abstractions
