On 6/28/07, Eric Pugh <[EMAIL PROTECTED]> wrote:
Sounds great to me! In the future, should I be communicating via JIRA issues?
Code should go in JIRA issues, but you can discuss it beforehand on the dev list if you like.
I have a PDF handler modeled on the CSVHandler that allows you to stream a PDF document to Solr, extract the text, and store it.
Cool! Any thoughts on a general framework for going from unstructured document -> Lucene document with fields? It feels like utilizing Apache Tika here would be the way to go (although it's in the really early stages).
-Yonik