Re: Connectors, Parsers, Plugin architecture

Chris Hostetter Mon, 15 Jan 2007 17:25:32 -0800

: (I mentioned this on solr-user, but people didn't seem to respond.)

You've got to give people more than a day, dude ... especially on a weekend
(a three day weekend in many parts of the US)


I already replied to your solr-user message, but you've made some
slightly different points here i'd like to reply to...

: Solr aims at being an answer to "enterprise needs", by indexing
: structured data for different applications. However I think that many
: enterprises would like to be able to structure information themselves.

that's exactly what Solr is about: letting a schema creator define
what the structure is, and letting people put data in whatever fields they
want.

: closed-source competition? It would be nice to index all of the following:
: 1) structured data
: 2) semi-structured data
: 3) unstructured data
:
: As it seems Solr meets demand (1) and somewhat demand (2), but provides
: no easy or built-in way to meet demand (3). It is therefore currently up
: to the application developer to create this functionality. This is very

the problem with providing support for unstructured data out of the box is
that it's got no structure :) ... how would Solr know what to do with the
binary data it finds? how would it know what charset to use when reading
that data? ... assuming it gets character data, how does it know which
strings should go in which fields? how does it know which analyzers to
use?

some code somewhere has to make these decisions ... at the moment that
code needs to be provided by the user and run outside of Solr ... i
suspect it won't be long before much of that code can run inside of Solr
as a plugin, but it will still need to be provided by the user to parse
truly unstructured data.
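
to make that concrete, here is a rough sketch of the kind of user-provided
code that runs outside of Solr today. everything in it is an assumption made
for illustration: the function name build_add_xml, the field names "id",
"title" and "body" (they would have to exist in your schema), and the rule
that the first non-empty line is the title. the only Solr-specific piece is
the <add><doc><field name="..."> XML update message. note that analyzers are
not chosen here at all: they follow from the field names via the schema.

```python
import xml.etree.ElementTree as ET


def build_add_xml(raw: bytes, doc_id: str, charset: str = "utf-8") -> str:
    """Turn raw bytes into a Solr <add> update message (illustrative only)."""
    # Decision 1: which charset to use. The caller has to know or guess it;
    # undecodable bytes are replaced rather than raising an error.
    text = raw.decode(charset, errors="replace")

    # Decision 2: which strings go in which fields. Hypothetical rule:
    # the first non-empty line is the title, the rest is the body.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = lines[0] if lines else ""
    body = " ".join(lines[1:])

    # Build the XML update message. The field names are assumptions and
    # must match fields defined in the schema.
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in (("id", doc_id), ("title", title), ("body", body)):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_xml(b"Quarterly report\nSales went up.\n", "doc-1"))
```

the resulting string is what you would POST to the update handler. the
point is that both decisions above live in your code, not in Solr.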




-Hoss
