Hi,

Importing files with the XML importer seems to work fine, but the Find&Merge
scenarios are more troublesome: I can't figure out how to find and
merge exact matches (I haven't even tried the fuzzy stuff).
Let me start by saying that I'm not a Java developer; I'm just trying to use
what is there. I'm looking for a working example that I can tinker with a
little, but I think the example classes aren't working properly.

I started with the XML importer examples. The class urlFinder worked after
updating the code a little (thanks to elfuego).
In PersonsFinder.java and MoviesFinder.java, "tmpObj.getNode().parent" has to
be replaced with "tmpObj.getNode().getBuilder()".
The classes then compile fine.

The example builder file for Moviesxxx has to be edited (name=guiname, and the
end tag </field> is missing).
I changed the path where the class is found, but that shouldn't be too
critical?!

When I try to import some of the example data I get errors in the logs and
nothing gets imported. It seems that there's an error in the class file.
Does anyone have a clue what's wrong?

I've attached the source of the class I used.



2007-03-08 17:25:56,746 SERVICE search.legacy.ConstraintParser.fallback -
Failed to parse Constraint from search condition string:
    sqlConstraint = length(fullname) between 10 and 12
    exception: Unknown field (of builder personsxxx): "length"
Falling back to BasicLegacyConstraint...
java.lang.IllegalArgumentException: Unknown field (of builder personsxxx):
"length"
   at org.mmbase.storage.search.legacy.ConstraintParser.getField(
ConstraintParser.java:460)
   at org.mmbase.storage.search.legacy.ConstraintParser.getField(
ConstraintParser.java:550)
   at
org.mmbase.storage.search.legacy.ConstraintParser.parseSimpleCondition(
ConstraintParser.java:667)
   at org.mmbase.storage.search.legacy.ConstraintParser.parseCondition(
ConstraintParser.java:584)
   at org.mmbase.storage.search.legacy.ConstraintParser.toConstraint(
ConstraintParser.java:512)
   at org.mmbase.util.QueryConvertor.setConstraint(QueryConvertor.java:125)
   at org.mmbase.core.util.StorageConnector.getSearchQuery(
StorageConnector.java:712)
   at org.mmbase.module.core.MMTable.searchVector(MMTable.java:367)
   at org.mmbase.module.core.MMTable.search(MMTable.java:322)
   at org.mmbase.applications.xmlimporter.BasicFinder.findPersistentObjects
(BasicFinder.java:41)
   at org.mmbase.xmlimporter.demo.PersonsFinder.getClosePersistentObjects(
PersonsFinder.java:130)
   at org.mmbase.applications.xmlimporter.BasicFinder.findSimilarObject(
BasicFinder.java:130)
   at org.mmbase.applications.xmlimporter.Transaction.mergeObjects(
Transaction.java:805)
   at org.mmbase.applications.xmlimporter.TransactionsParser.endElement(
TransactionsParser.java:458)
   at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(
AbstractSAXParser.java:633)
   at
com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.endNamespaceScope
(XMLDTDValidator.java:2108)
   at
com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.handleEndElement
(XMLDTDValidator.java:2059)
   at
com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.endElement(
XMLDTDValidator.java:932)
   at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement
(XMLDocumentFragmentScannerImpl.java:1241)
   at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch
(XMLDocumentFragmentScannerImpl.java:1685)
   at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument
(XMLDocumentFragmentScannerImpl.java:368)
   at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(
XML11Configuration.java:834)
   at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(
XML11Configuration.java:764)
   at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(
XMLParser.java:148)
   at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(
AbstractSAXParser.java:1242)
   at org.mmbase.applications.xmlimporter.TransactionsParser.parse(
TransactionsParser.java:601)
   at
org.mmbase.applications.xmlimporter.TransactionHandler.handleTransaction(
TransactionHandler.java:153)
   at org.mmbase.applications.xmlimporter.TransactionHandler$1.run(
TransactionHandler.java:109)
2007-03-08 17:25:56,749 SERVICE search.legacy.ConstraintParser.fallback -
Failed to parse Constraint from search condition string:
    sqlConstraint = length(title) between 20 and 22
    exception: Unknown field (of builder moviesxxx): "length"
Falling back to BasicLegacyConstraint...
java.lang.IllegalArgumentException: Unknown field (of builder moviesxxx):
"length"
   at org.mmbase.storage.search.legacy.ConstraintParser.getField(
ConstraintParser.java:460)
   at org.mmbase.storage.search.legacy.ConstraintParser.getField(
ConstraintParser.java:550)
   at
org.mmbase.storage.search.legacy.ConstraintParser.parseSimpleCondition(
ConstraintParser.java:667)
   at org.mmbase.storage.search.legacy.ConstraintParser.parseCondition(
ConstraintParser.java:584)
   at org.mmbase.storage.search.legacy.ConstraintParser.toConstraint(
ConstraintParser.java:512)
   at org.mmbase.util.QueryConvertor.setConstraint(QueryConvertor.java:125)
   at org.mmbase.core.util.StorageConnector.getSearchQuery(
StorageConnector.java:712)
   at org.mmbase.module.core.MMTable.searchVector(MMTable.java:367)
   at org.mmbase.module.core.MMTable.search(MMTable.java:322)
   at org.mmbase.applications.xmlimporter.BasicFinder.findPersistentObjects
(BasicFinder.java:41)
   at org.mmbase.xmlimporter.demo.MoviesFinder.getClosePersistentObjects(
MoviesFinder.java:130)
   at org.mmbase.applications.xmlimporter.BasicFinder.findSimilarObject(
BasicFinder.java:130)
   at org.mmbase.applications.xmlimporter.Transaction.mergeObjects(
Transaction.java:805)
   at org.mmbase.applications.xmlimporter.TransactionsParser.endElement(
TransactionsParser.java:458)
   at
com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(
AbstractSAXParser.java:633)
   at
com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.endNamespaceScope
(XMLDTDValidator.java:2108)
   at
com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.handleEndElement
(XMLDTDValidator.java:2059)
   at
com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.endElement(
XMLDTDValidator.java:932)
   at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement
(XMLDocumentFragmentScannerImpl.java:1241)
   at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch
(XMLDocumentFragmentScannerImpl.java:1685)
   at
com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument
(XMLDocumentFragmentScannerImpl.java:368)
   at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(
XML11Configuration.java:834)
   at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(
XML11Configuration.java:764)
   at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(
XMLParser.java:148)
   at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(
AbstractSAXParser.java:1242)
   at org.mmbase.applications.xmlimporter.TransactionsParser.parse(
TransactionsParser.java:601)
   at
org.mmbase.applications.xmlimporter.TransactionHandler.handleTransaction(
TransactionHandler.java:153)
   at org.mmbase.applications.xmlimporter.TransactionHandler$1.run(
TransactionHandler.java:109)
2007-03-08 17:25:56,761 ERROR   mmbase.module.core.TransactionManager -
Can't resolve transaction AnTr1173370988731
2007-03-08 17:25:56,761 ERROR   mmbase.module.core.TransactionManager -
Nodes
[_exists='yes',title='Finalist the Movie
II',owner='import',intro='null',colorformat='null',playtime='null',year='2001',_number='import_m1',source='null',based='null',otype='15549',body='null',format='null',number='-1'
[EMAIL PROTECTED],
lastname='null',gender='null',_exists='yes',owner='import',firstname='null',_number='import_p3',particle='null',nick='Erik1',fullname='Erie
Visser',otype='11727',number='-1'
[EMAIL PROTECTED],
_dnumber='import_p3',_exists='no',owner='import',_number='import_AnRel1173370988732',rnumber='42',_snumber='import_m1',snumber='15732',dnumber='15731',dir='null',otype='5',number='15730'
[EMAIL PROTECTED]
2007-03-08 17:25:56,763 INFO
mmbase.applications.xmlimporter.Transaction- Stop transaction: Thu Mar
08 17:25:56 CET 2007
2007-03-08 17:25:56,764 INFO
mmbase.applications.xmlimporter.Transaction- Stop transaction: Thu Mar
08 17:25:56 CET 2007
2007-03-08 17:25:56,764 ERROR
mmbase.applications.xmlimporter.TransactionHandler - TransactionError :
org.mmbase.applications.xmlimporter.TransactionHandlerException: Can't
resolve transaction AnTr1173370988731[_exists='yes',title='Finalist the
Movie
II',owner='import',intro='null',colorformat='null',playtime='null',year='2001',_number='import_m1',source='null',based='null',otype='15549',body='null',format='null',number='-1'
[EMAIL PROTECTED],
lastname='null',gender='null',_exists='yes',owner='import',firstname='null',_number='import_p3',particle='null',nick='Erik1',fullname='Erie
Visser',otype='11727',number='-1'
[EMAIL PROTECTED],
_dnumber='import_p3',_exists='no',owner='import',_number='import_AnRel1173370988732',rnumber='42',_snumber='import_m1',snumber='15732',dnumber='15731',dir='null',otype='5',number='15730'
[EMAIL PROTECTED]
2007-03-08 17:25:56,764 ERROR
mmbase.applications.xmlimporter.TransactionHandler - ExceptionPage none
2007-03-08 17:25:56,764 WARN
mmbase.applications.xmlimporter.TransactionHandler - Transaction stopped at
: Thu Mar 08 17:25:56 CET 2007
/*
 * ClassName: MoviesFinder.java
 *
 * Date: dec. 1st. 2001
 *
 * Copyright notice:
 * This software is OSI Certified Open Source Software.
 * OSI Certified is a certification mark of the Open Source Initiative.
 *
 * The license (Mozilla version 1.0) can be read at the MMBase site.
 * See http://www.MMBase.org/license
 */

package org.mmbase.xmlimporter.demo;

import java.util.*;
import org.mmbase.applications.xmlimporter.*;

/**
 * This class implements the SimilarObjectFinder interface 
 * for "moviesxxx" objects, by comparing their "title" fields,
 * using the string comparison algorithm of FuzzyStringMatcher.
 * 
 * @author Rob van Maris
 * @version 0.1
 */
public class MoviesFinder extends BasicFinder {
    
    /** Parameter name used to set titleThreshold. */
    public final static String TITLE_THRESHOLD = "title_threshold";
    
    /** The titleThreshold value used. */
    private float titleThreshold;
    
    /** Creates new MoviesFinder. */
    public MoviesFinder() {}

    /**
     * Initializes this instance. This implementation requires a
     * value between 0 and 1 for the parameter "title_threshold".
     * @param params The initialization parameters, provided as
     * name/value pairs (both String).
     * @throws TransactionHandlerException when the parameter is not
     * specified, or its value cannot be parsed to a float.
     */
    public void init(HashMap params) throws TransactionHandlerException {
        // Get titleThreshold parameter.
        String param = (String) params.get(TITLE_THRESHOLD);
        if (param == null) {
            // Parameter not set.
            throw new TransactionHandlerException(
            "MoviesFinder: parameter \"" 
            + TITLE_THRESHOLD + "\" not set");
        }
        try {
            titleThreshold = Float.parseFloat(param);
        } catch (NumberFormatException e) {
            // Invalid value for parameter.
            throw new TransactionHandlerException(
            "MoviesFinder: invalid value for parameter \"" 
            + TITLE_THRESHOLD + "\": " + param);
        }
        super.init(params);
    }
    
    /**
     * Calculates matching rate for two movies. This implementation
     * calculates the matching rate by comparing just the "title" fields.
     * @param tmpObj1 The object for which the matching rate is wanted.
     * @param tmpObj2 The object to match with.
     * @return Matching rate, but 0.0 when the matching rate is less than
     * the titleThreshold value.
     */
    public float scoreNode(TmpObject tmpObj1, TmpObject tmpObj2) {
        // Get title for both objects.
        String title1 = (String) tmpObj1.getField("title");
        String title2 = (String) tmpObj2.getField("title");
        
        // A missing title cannot match anything.
        if (title1 == null || title2 == null) {
            return 0.0f;
        }

        // Test for exact match.
        if (title2.equals(title1)) {
            return 1.0f;
            
        // Not exact match, so test for close match.
        } else {
            float matchRate 
                = FuzzyStringMatcher.getMatchRate(title1, title2);
            if (matchRate < titleThreshold) {
                // Not qualifying.
                return 0.0f;
            } else {
                // Qualifying.
                return matchRate;
            }
        }
    }
    
    /**
     * Gets MMBase id's for all objects from persistent cloud that
     * produce an exact match with the given object (possibly
     * including the object itself).
     * This implementation looks for movies with the same value
     * for the "title" field.
     * @param tmpObj The object to match with.
     * @return Collection of (Integer) MMBase id's for objects from the
     * persistent cloud that produce an exact match with the given
     * object.
     */
    public Collection getExactPersistentObjects(TmpObject tmpObj) {
        String title = (String)tmpObj.getField("title");
        return findPersistentObjects(tmpObj.getNode().getBuilder(),
            "title = '" + title + "'");
    }
    
    /**
     * Gets MMBase id's for all objects from persistent cloud that
     * might produce a qualifying match with the given object
     * (possibly including the object itself).
     * This implementation looks for all movies where the stringlength
     * of the "title" field is compatible with the titleThreshold 
     * value.
     * @param tmpObj The object to match with.
     * @return Collection of (Integer) MMBase id's for objects from the
     * persistent cloud that might produce a qualifying match with the
     * given object.
     */
    public Collection getClosePersistentObjects(TmpObject tmpObj) {
        String title = (String)tmpObj.getField("title");
        int min = (int) Math.ceil(title.length() * titleThreshold);
        int max = (int) Math.floor(title.length() / titleThreshold);
        return findPersistentObjects(tmpObj.getNode().getBuilder(),
            "length(title) between " + min + " and " + max);
    }
    
}
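
For reference, here is a minimal standalone sketch (no MMBase needed) of the two
constraint strings that the finder above passes to findPersistentObjects().
The class name ConstraintSketch and both method names are made up for
illustration. The threshold of 0.95 is only an assumption that happens to
reproduce the "between 20 and 22" seen in the log, and I have not checked
whether the legacy constraint parser accepts the doubled single quote used
for escaping here.

```java
/** Hypothetical helper, not part of MMBase: it only builds the constraint strings. */
final class ConstraintSketch {

    /** Mirrors getClosePersistentObjects(): a window on the string length. */
    static String closeConstraint(String field, String value, float threshold) {
        // Reject thresholds outside (0, 1], including NaN.
        if (!(threshold > 0.0f && threshold <= 1.0f)) {
            throw new IllegalArgumentException("threshold must be in (0, 1]: " + threshold);
        }
        int min = (int) Math.ceil(value.length() * threshold);
        int max = (int) Math.floor(value.length() / threshold);
        return "length(" + field + ") between " + min + " and " + max;
    }

    /** Mirrors getExactPersistentObjects(), but doubles single quotes in the value. */
    static String exactConstraint(String field, String value) {
        return field + " = '" + value.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        // Assumed threshold 0.95; the title has 21 characters.
        System.out.println(closeConstraint("title", "Finalist the Movie II", 0.95f));
        // prints: length(title) between 20 and 22

        // An apostrophe in the value would otherwise end the quoted string early.
        System.out.println(exactConstraint("title", "Ocean's Eleven"));
        // prints: title = 'Ocean''s Eleven'
    }
}
```

With a threshold of 1.0 the window collapses to the exact length of the title,
so the "close" search would only consider titles of the same length.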