I had this problem too. I don't have an answer for Woodstox itself, but here's a valid **workaround** using reflection that you can call once you have your POJOs:

Call it like this:
...
List list = delegate.doSearch(queryObject);
Validator val = new Validator();
val.cleanStrings(list);
return list;
...


// <<<<<<<< Start Validator.java
package au.gov.environment.imgws.service.helper;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

import java.util.Collection;
import java.lang.reflect.Method;

import au.com.sra.framework.vo.IValueObject;
import au.gov.environment.imgws.service.vo.ImageSearch;

/**
 * Created by IntelliJ IDEA.
 * User: mkerle
 * Date: 26/07/2007
 * Time: 10:25:09
 */
public class Validator {
    private static final Log log = LogFactory.getLog(Validator.class);

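    /**
     * Some of the data returned by the ImageSearch contains control
     * characters (below 0x20, other than CR, LF and TAB) that stuff up the
     * Woodstox XmlWriter. This method replaces each one with a space, in
     * place, on every String property of the given bean or collection of beans.
     * @param dirty a bean (IValueObject) or a Collection of beans to clean
     */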
    public void cleanStrings(Object dirty){
        if(dirty instanceof Collection){
            //handle a collection
            Collection c = (Collection)dirty;
            for(Object o: c){
                cleanObject(o);
            }
        }else if(dirty instanceof IValueObject){ //IValueObject is our pojo bean marker interface
            cleanObject(dirty);
        }
        else {
            //anything else is not a bean we know how to clean, so ignore it
        }
    }

    //handles any kind of JavaBean
    private void cleanObject(Object dirty){
        Method[] methods = dirty.getClass().getMethods();
        Class[] paramTypes = {String.class };
        String attr;
        for(Method m: methods){
            if(m.getReturnType().equals(String.class)
                    && m.getName().startsWith("get")
                    && m.getParameterTypes().length == 0){
                //we now have a getter for a String property on a bean
                try{
                    attr = m.getName();
                    attr = attr.substring(3,attr.length());
                    String s = (String)m.invoke(dirty);
                    if(s!=null){
                        s = validateASCII(dirty, attr, s);
                    }
                    if(s!=null){
                        //getMethod never returns null; a missing setter raises
                        //NoSuchMethodException, which the catch below logs
                        Method setString = dirty.getClass().getMethod("set"+attr, paramTypes);
                        setString.invoke(dirty, s);
                    }
                }catch(Exception e){
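                    //pass the exception as the second argument so the stack trace is logged
                    log.error("Could not clean " + m.getName() + " on " + dirty.getClass().getName(), e);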
                }
            }
        }
    }



    private String validateASCII(Object dirty, String column, String s) {
        boolean changed = false;
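        //debugging hook: put a breakpoint on the inner line to inspect a known-bad record
        if(dirty instanceof ImageSearch){
            ImageSearch image = (ImageSearch) dirty;
            //constant-first equals avoids a NullPointerException on a null barcode
            if("eadig00962".equals(image.getBarcode())){
                image.getBarcode();
            }
        }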
        for(int i = 0; i <  s.length(); i++){
            char c = s.charAt(i);
            if(c < ASCII.SPACE.value()
                    && c != ASCII.CR.value()
                    && c != ASCII.LF.value()
                    && c != ASCII.TAB.value()){
                //ah-ha! we've got one of the buggers! ok lets sanitise it
                // and log it to output.
                log.error("Illegal character '0x" + Integer.toHexString(c) + "' found in attribute " + column);
                log.error(dirty.toString());
                s = s.substring(0,i) + " " + s.substring(i+1, s.length());
                changed = true;
            }
        }
        return changed?s:null;
    }
}

enum ASCII{
    CR(13),
    LF(10),
    TAB(9),
    SPACE(32);
   
    private final int c;
    ASCII(int b){c = b;  }
    int value(){return c;}
}

// <<<<<<<<<<<<<end Validator.java
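
If you don't need the reflection part, the character test at the heart of validateASCII can be sketched on its own. This is only a minimal sketch: the class name XmlCharSanitizer and its methods are made up for illustration and are not part of Woodstox or of the Validator above. It covers the same range as the code above (control characters below 0x20 other than TAB, LF and CR) and nothing else that XML 1.0 forbids.

```java
// Hypothetical helper, not part of Woodstox or of Validator.java.
final class XmlCharSanitizer {
    private XmlCharSanitizer() {}

    /** True for control characters below 0x20 other than TAB, LF and CR. */
    static boolean isIllegalControl(char c) {
        return c < 0x20 && c != '\t' && c != '\n' && c != '\r';
    }

    /**
     * Returns s with each illegal control character replaced by a space.
     * Returns the same instance when nothing had to change, and null for null.
     */
    static String sanitize(String s) {
        if (s == null) {
            return null;
        }
        StringBuilder out = null; // allocated only when the first bad character is found
        for (int i = 0; i < s.length(); i++) {
            if (isIllegalControl(s.charAt(i))) {
                if (out == null) {
                    out = new StringBuilder(s);
                }
                out.setCharAt(i, ' ');
            }
        }
        return out == null ? s : out.toString();
    }

    public static void main(String[] args) {
        // \u0008 is a backspace, which is not allowed in XML 1.0 content
        System.out.println(sanitize("bar\u0008code")); // prints "bar code"
    }
}
```

Using a StringBuilder avoids building a new String for every bad character, and returning the same instance makes it cheap to tell whether a setter call is needed at all.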

Matthew Kerle


Roman Dolgov wrote:
Hi All,

Is there any way to let woodstox convert illegal characters into 'character entity'? (or strip them out).

The problem I am facing is that some of my data may contain illegal characters, which causes a com.ctc.wstx.exc.WstxIOException: Invalid white space character.. exception.
I want to avoid running a check on all my Java beans and instead have just one place where everything gets 'fixed'.

Any ideas on how best to handle this problem, either by configuring Woodstox or by providing some custom writer, are welcome.

Thanks,
Roman