Anil Pachuri wrote on 2/21/13 3:22 PM:
> Hi,
>
> Does Lucy have a utility to accept raw XML files as input? I have 50 XML
> files and I need to index selected fields in them using Lucy.
>
If you install SWISH::Prog::Lucy from CPAN, you get the swish3 tool installed, which will index XML (and HTML et al) files for Lucy. You can specify which XML elements you want treated as Lucy fields with a configuration file. For example:

  # a document like
  <doc>
   <foo>bar</foo>
  </doc>

  # a config file like
  MetaNames foo
  PropertyNames foo

  # and then index the file like:
  % swish3 -F lucy -c configfile -i doc.xml

  # and search like:
  % swish3 -q foo:bar

The configuration docs are at:

  http://swish-e.org/docs/swish-config.html

You might also want to look at Dezi, which does the same thing with a server/client setup.

  http://dezi.org/

> Also, is there any general perl utility to merge multiple XML files or
> convert these into tabular format?

CPAN has many XML handling tools. I'm sure there's something there that will do most or all of what you want.

-- 
Peter Karman . http://peknet.com/ . [email protected]
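
To illustrate the "convert these into tabular format" question, here is a minimal sketch. It is written in Python rather than Perl because the thread names no specific CPAN module, so treat it as an outline of the idea and not as a recommendation of any particular tool. It assumes each document is flat, with the wanted fields as direct children of the root element, as in the <doc> example above. The field name "title" and the helper names xml_to_row and rows_to_csv are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical field list: "foo" comes from the example document,
# "title" is invented here to show how a missing field is handled.
FIELDS = ["foo", "title"]


def xml_to_row(xml_text, fields=FIELDS):
    """Turn one flat XML document into a dict keyed by field name."""
    root = ET.fromstring(xml_text)
    row = {}
    for name in fields:
        # findtext() returns the text of the first matching direct child,
        # or the default when the element is absent.
        row[name] = (root.findtext(name, default="") or "").strip()
    return row


def rows_to_csv(rows, fields=FIELDS):
    """Render a list of row dicts as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Two in-memory documents standing in for the XML files on disk.
docs = [
    "<doc><foo>bar</foo><title>First</title></doc>",
    "<doc><foo>baz</foo></doc>",
]
rows = [xml_to_row(d) for d in docs]
table = rows_to_csv(rows)
print(table, end="")
```

For real files, read each file's contents and pass them to xml_to_row in a loop. Merging then amounts to collecting one row per file before writing the table. Nested or repeated elements would need a different mapping than this flat one.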
