On 11/29/06, Andrew Nagy <[EMAIL PROTECTED]> wrote:
> So ... while we are on this topic. You wouldn't want to index MARCXML records in Lucene, you would use MARC21, right? Why deal with the overhead of XML if it is not necessary? We have to format our data no matter what to best fit our storage/search system.
This seems like six of one, half a dozen of the other to me. I don't think Lucene cares either way which you use. In my mind, it is just a matter of preference... do I want to use XML tools (SAX, XOM, REXML) or MARC-specific tools (marc4j, pymarc, ruby-marc)? All could be used to build Lucene indices.

On the other hand, what do I want to do with the data after it is indexed? Do I want to be able to display a whole record (versus just the little bit I might have stored in the Lucene index)? If so, I'd rather be working with XML. If I'm just pointing users back to my OPAC, though, I don't see much difference (other than personal preference) in the tool choice.

Kevin
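
To make the "Lucene doesn't care" point concrete, here is a minimal sketch in Python using only the standard library. It flattens the same made-up record, once from MARCXML and once from MARC21, into identical field/value pairs, which is all an indexer would see. The function names, the "245a"-style keys, the sample record, and the build_marc21 helper are assumptions for illustration; in practice marc4j, pymarc, or ruby-marc would do the parsing, and the actual Lucene call is not shown.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.loc.gov/MARC21/slim}"          # MARCXML namespace
FT, SF, RT = b"\x1e", b"\x1f", b"\x1d"           # field, subfield, record delimiters


def fields_from_marcxml(xml_text):
    """Flatten a MARCXML record into {"245a": ["..."], ...}."""
    doc = {}
    for datafield in ET.fromstring(xml_text).iter(NS + "datafield"):
        tag = datafield.get("tag")
        for sub in datafield.iter(NS + "subfield"):
            doc.setdefault(tag + sub.get("code"), []).append(sub.text or "")
    return doc


def fields_from_marc21(raw):
    """Flatten a MARC21 (ISO 2709) record into the same shape."""
    base = int(raw[12:17].decode("ascii"))       # leader 12-16: base address of data
    directory = raw[24:base - 1]                 # 12-byte entries after the leader
    doc = {}
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        if tag < "010":
            continue                             # control fields carry no subfields
        field = raw[base + start:base + start + length - 1]  # drop field terminator
        for chunk in field.split(SF)[1:]:        # first piece holds the two indicators
            key = tag + chunk[:1].decode("ascii")
            doc.setdefault(key, []).append(chunk[1:].decode("utf-8"))
    return doc


def build_marc21(fields):
    """Hypothetical helper: serialize [(tag, indicators, [(code, value)])] as MARC21."""
    directory, data = b"", b""
    for tag, indicators, subfields in fields:
        body = indicators.encode("ascii")
        for code, value in subfields:
            body += SF + code.encode("ascii") + value.encode("utf-8")
        body += FT
        directory += b"%s%04d%05d" % (tag.encode("ascii"), len(body), len(data))
        data += body
    base = 24 + len(directory) + 1
    total = base + len(data) + 1
    leader = b"%05dnam a22%05d a 4500" % (total, base)
    return leader + directory + FT + data + RT


# Made-up sample record, expressed in both formats.
SAMPLE_XML = """
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Author, Example</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
  </datafield>
</record>
"""
SAMPLE_MARC21 = build_marc21([
    ("100", "1 ", [("a", "Author, Example")]),
    ("245", "10", [("a", "An example title")]),
])

if __name__ == "__main__":
    # Both paths yield the same dict, ready to hand to whatever builds the index.
    print(fields_from_marcxml(SAMPLE_XML))
    print(fields_from_marc21(SAMPLE_MARC21))
```

Once the record is reduced to field/value pairs like these, the source format no longer matters to the index. The difference only shows up later, if the full record needs to be stored and displayed.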
