I am new to PyLucene but have worked with Java Lucene 1.4.3. How can I index the following types of files using PyLucene: Word files, PDF, Excel, XSL, XML, and OpenOffice files? Is there any support for third-party libraries in PyLucene as well?
If there are Python equivalents of the plain-text filters for these file types, use them. If not, you can try to integrate the Java versions into your build of PyLucene. Take a look at how the sandbox Highlighter package was integrated as an example. If the licensing of these third-party tools allows it, and if there is enough interest, I could integrate them into PyLucene too.
Can you send me URLs to these Java packages?
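To illustrate the Python plain-text filter route mentioned above, here is a minimal sketch that uses only the standard library. It covers XML and OpenOffice/OpenDocument files (which are zip archives holding a content.xml). The names xml_to_text, opendocument_to_text, extract_text and the FILTERS table are made up for this example and are not part of PyLucene. Word, PDF and Excel files are not handled here and would need a third-party or Java extractor.

```python
import io
import os
import zipfile
import xml.etree.ElementTree as ET


def xml_to_text(data):
    """Return the text content of an XML document (bytes), with tags dropped."""
    root = ET.fromstring(data)
    # itertext() yields the text and tail strings of every element in order.
    # Joining with a space keeps words in adjacent elements from running
    # together; split()/join() then collapses the extra whitespace.
    return " ".join(" ".join(root.itertext()).split())


def opendocument_to_text(data):
    """Return the text of an OpenOffice/OpenDocument file given its raw bytes."""
    # These files are zip archives; the document body lives in content.xml.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return xml_to_text(archive.read("content.xml"))


# Hypothetical dispatch table: file extension -> plain-text filter.
FILTERS = {
    ".xml": xml_to_text,
    ".xsl": xml_to_text,
    ".sxw": opendocument_to_text,
    ".odt": opendocument_to_text,
    ".ods": opendocument_to_text,
}


def extract_text(filename, data):
    """Run the filter registered for the file's extension, or return None."""
    extension = os.path.splitext(filename)[1].lower()
    text_filter = FILTERS.get(extension)
    if text_filter is None:
        return None  # no filter available, e.g. for .doc, .pdf, .xls
    return text_filter(data)
```

The string returned by extract_text is what you would then store in a tokenized field of a Lucene Document. Files for which it returns None are the ones that still need an external filter.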
Andi..
