It would be great if this information went at least into the FAQ, and even better if we added a page to the site documentation. I'm thinking maybe a whole page titled "Integrating with Solr", which would walk you through the process and the pitfalls. What do you think?

Karl

On Wed, Mar 30, 2011 at 11:39 AM, Erlend Garåsen <[email protected]> wrote:
>
> Solr 1.4.1 has several bugs which make it difficult to deploy MCF on an
> application server such as Resin. I have struggled a lot with some of these
> bugs and decided to share my experiences in case others have the same
> problems.
>
> First I figured out that I had to upgrade Tika to version 0.8 in order to
> extract the content of MS Office documents etc. Solr 1.4.1 ships with Tika
> 0.4 and will not work:
> https://issues.apache.org/jira/browse/SOLR-1902
>
> Here you have basically two options:
> 1. Install the following branch:
> http://svn.apache.org/viewvc/lucene/solr/branches/branch-1.4/
> 2. Install the latest version from trunk (not recommended for production
> use).
>
> Then I figured out that I couldn't parse dates correctly. You have the
> option in ExtractingRequestHandler to specify different date formats, as in
> the following example:
> <lst name="date.formats">
>   <str>yyyy-MM-dd</str>
>   <str>dd.MM.yyyy</str>
> </lst>
>
> This will cause a lazy loading error due to the following bug:
> https://issues.apache.org/jira/browse/SOLR-1756
>
> You have the following workarounds:
> 1. Install the branch mentioned above and then install the following patch:
> https://issues.apache.org/jira/secure/attachment/12434831/SOLR-1756.patch
> 2. Install the latest version from trunk.
>
> Remember to rebuild Solr and place the necessary jar files in a separate
> folder which your application server has access to (apache-solr-cell*.jar,
> Tika and its dependencies).
>
> Erlend
>
> --
> Erlend Garåsen
> Center for Information Technology Services
> University of Oslo
> P.O. Box 1086 Blindern, N-0317 OSLO, Norway
> Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050
>
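
For anyone following along, here is a rough sketch of where the date.formats list quoted above would probably sit in solrconfig.xml. The handler name "/update/extract" and the startup="lazy" attribute are assumptions taken from the stock Solr example configuration, not from Erlend's setup, so adjust them to match your own install.

```xml
<!-- Sketch only: the handler name and startup="lazy" are assumed from the
     stock Solr example config; they are not confirmed by the message above. -->
<requestHandler name="/update/extract"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler"
                startup="lazy">
  <!-- Date patterns used when parsing extracted date fields,
       exactly as listed in the quoted message. -->
  <lst name="date.formats">
    <str>yyyy-MM-dd</str>
    <str>dd.MM.yyyy</str>
  </lst>
</requestHandler>
```

On an unpatched Solr 1.4.1 this is the configuration that runs into SOLR-1756, so it should only be expected to load after applying one of the workarounds listed above.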
