: (I mentioned this on solr-user, but people didn't seem to respond.)

You've got to give people more than a day, dude ... especially on a
weekend (a three day weekend in many parts of the US).

I already replied to your solr-user message, but you've made some
slightly different points here that i'd like to reply to...

: Solr aims at being an answer to "enterprise needs", by indexing
: structured data for different applications. However I think that many
: enterprises would like to be able to structure information themselves.

that's exactly what Solr is about: letting a schema creator define what
the structure is, and letting people put data in whatever fields they
want.

: closed-source competition? It would be nice to index all of the following:
: 1) structured data
: 2) semi-structured data
: 3) unstructured data
:
: As it seems Solr meets demand (1) and somewhat demand (2), but provides
: no easy or built-in way to meet demand (3). It is therefore currently up
: to the application developer to create this functionality. This is very

the problem with providing support for unstructured data out of the box
is that it's got no structure :) ... how would Solr know what to do with
the binary data it finds? how would it know what charset to use when
reading that data? ... assuming it gets character data, how does it know
which strings should go in which fields? how does it know which
analyzers to use?

some code somewhere has to make these decisions ... at the moment that
code needs to be provided by the user and run outside of Solr ... i
suspect it won't be long before much of that code can run inside of Solr
as a plugin, but it will still need to be provided by the user to parse
truly unstructured data.

-Hoss
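To make the questions above concrete, here is a rough sketch of the kind of user-provided code that would run outside of Solr. It is an illustration only, not anything Solr ships: the function names, the charset fallback order, and the field names "id" and "text" are all assumptions and would have to match whatever your own schema defines. It picks a charset, decides which string goes in which field, and builds an XML <add> message to send to Solr's update handler. Which analyzers apply is still decided per field in the schema, not here.

```python
import xml.etree.ElementTree as ET


def decode_bytes(raw, charsets=("utf-8", "latin-1")):
    """Decision 1: which charset? Try each candidate in order.

    The candidate list is an assumption; latin-1 accepts any byte
    sequence, so with these defaults it acts as a last resort.
    """
    for charset in charsets:
        try:
            return raw.decode(charset)
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate charset could decode the data")


def build_add_message(doc_id, raw):
    """Decision 2: which strings go in which fields?

    Here everything goes into one hypothetical catch-all field named
    "text", next to a hypothetical unique key field named "id".
    """
    text = decode_bytes(raw)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in (("id", doc_id), ("text", text)):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    # ElementTree escapes &, < and > in the field values for us.
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # These bytes are not valid UTF-8, so the latin-1 fallback is used.
    print(build_add_message("doc1", b"caf\xe9 & bar"))
```

The resulting string would then be POSTed to the update handler by whatever client code you already use. Real documents (PDF, Word, HTML and so on) need a format-specific text extraction step before any of this, and that is exactly the part that can't be guessed generically.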
