Hi Guido: (CC-ing project-cdsware-developers)
On Fri, 26 Mar 2010, Guido Pelzer wrote:
>> On Fri, 26 Mar 2010, Guido Pelzer wrote:
>>> many thanks for your mail. bibmatch works good. do you have a
>>> detailed comment of bibmatch options?
>>
>> Currently not much docs besides the guide:
>>
>>    <http://invenio-demo.cern.ch/help/admin/bibmatch-admin-guide>
>>
>> But it may soon be updated, since Marko will work on small fuzzy-like
>> features:
>>
>>    <https://savannah.cern.ch/task/?3273>
>>
>
> yes, i had already seen, but i have problems with the advanced options,
> especially
> -m --mode=(a|e|o|p|r)[3]
> -o --operator=(a|o)[2] -> and/or???
> different between --print-new and --print-match

The mode and operator values are taken from the search engine API.
There, the matching types are a = all of the words, e = exact phrase,
o = any of the words, p = partial phrase and r = regular expression,
and the operators are a = and, o = or.

As for the output streams: NEW prints unmatched records, MATCH prints
matched records when there was exactly one dupe-like hit, and AMBIGUOUS
prints matched records when there was more than one dupe-like hit.

Personally I would prefer bibmatch to produce more than one output
stream in one go, for example in the case of two output streams:

   $ bibmatch foo.xml > foo_unmatched.xml 2> foo_matched.xml

so that one only has to process the output files, without diffing them
against the input file.

Since Marko is attacking this module WRT some fuzziness, we can as well
take this opportunity and change/prettify its API...

Best regards
-- 
Tibor Simko ** CERN Document Server ** <http://cds.cern.ch/>
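
To illustrate the NEW / MATCH / AMBIGUOUS distinction described above,
here is a minimal Python sketch of splitting records into all three
streams in a single pass. It is not bibmatch's actual code: the names
classify, split_records and find_hits are hypothetical, and the toy
"database" is just a list of strings standing in for real records.

```python
# Hypothetical sketch only: not bibmatch's real implementation or API.
from typing import Callable, Dict, Iterable, List


def classify(hit_count: int) -> str:
    """Map the number of dupe-like hits to an output stream name."""
    if hit_count == 0:
        return "NEW"        # no hit: record is unmatched
    if hit_count == 1:
        return "MATCH"      # exactly one dupe-like hit
    return "AMBIGUOUS"      # more than one dupe-like hit


def split_records(
    records: Iterable[str],
    find_hits: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Sort every input record into one of the three streams in one pass."""
    streams: Dict[str, List[str]] = {"NEW": [], "MATCH": [], "AMBIGUOUS": []}
    for record in records:
        streams[classify(len(find_hits(record)))].append(record)
    return streams


if __name__ == "__main__":
    existing = ["alpha", "beta", "beta"]  # toy stand-in for the database
    result = split_records(
        ["alpha", "beta", "gamma"],
        lambda rec: [e for e in existing if e == rec],
    )
    for name, recs in result.items():
        print(name, recs)
```

With all three streams produced together, each stream could be written
to its own file, so nothing needs to be diffed against the input.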
