Hi Roman,
Did you ever find an answer to your question about how to get Woodstox
to encode problem fields as character entities, to deal with the illegal
character problem? The workaround that I came up with works, but we'd
prefer not to use a 'hack' like this in a production system.
I'm thinking there must be something we can set in the Aegis mapping
file (or as an annotation attribute) that will tell XFire to encode that
field so that illegal characters can be passed through. Does anyone have
any ideas?
Looking at the Aegis mapping XSD, there doesn't seem to be much in the
way of parameters to mark a field that might contain bad data:
http://xfire.codehaus.org/schemas/1.0/mapping.xsd
Matthew Kerle
IT Consultant
SRA Information Technology
Canberra
Ground Floor, 40 Brisbane Avenue
BARTON ACT 2600
Office: +61 2 6273 6122
Fax: +61 2 6273 6155
Mobile: +61404 096 863
Email: [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>
Web: www.sra.com.au
Matthew Kerle wrote:
I had this problem. I don't have an answer for Woodstox, but here's a
valid **workaround** using reflection that you can call once you have
your POJOs.
Call it like this:
...
List list = delegate.doSearch(queryObject);
Validator val = new Validator();
val.cleanStrings(list);
return list;
...
// <<<<<<<< Start Validator.java
package au.gov.environment.imgws.service.helper;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import java.util.Collection;
import java.lang.reflect.Method;
import au.com.sra.framework.vo.IValueObject;
import au.gov.environment.imgws.service.vo.ImageSearch;
/**
* Created by IntelliJ IDEA.
* User: mkerle
* Date: 26/07/2007
* Time: 10:25:09
*/
public class Validator {
private static final Log log = LogFactory.getLog(Validator.class);
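/**
* Some of the data returned by the ImageSearch contains control
* characters (below 0x20) that stuff up the Woodstox XmlWriter. This method
* replaces them with spaces in every String property it can find.
* @param dirty a Collection of beans, or a single IValueObject
*/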
public void cleanStrings(Object dirty){
if(dirty instanceof Collection){
//handle a collection
Collection c = (Collection)dirty;
for(Object o: c){
cleanObject(o);
}
}else if(dirty instanceof IValueObject){
//IValueObject is our pojo bean marker interface
cleanObject(dirty);
}
//anything else is left untouched
}
//handles any kind of JavaBean
private void cleanObject(Object dirty){
Method[] methods = dirty.getClass().getMethods();
Class[] paramTypes = {String.class };
String attr;
for(Method m: methods){
if(m.getReturnType().equals(String.class)
&& m.getName().startsWith("get")
&& m.getParameterTypes().length == 0){
//we now have a getter for a String property on a bean
try{
attr = m.getName();
attr = attr.substring(3,attr.length());
String s = (String)m.invoke(dirty);
if(s!=null){
s = validateASCII(dirty, attr, s);
}
if(s!=null){
//getMethod() never returns null; a missing setter
//surfaces as NoSuchMethodException, logged below
Method setString =
dirty.getClass().getMethod("set"+attr, paramTypes);
setString.invoke(dirty, s);
}
}catch(Exception e){
log.error("Could not clean a String property on "
+ dirty.getClass().getName(), e);
}
}
}
}
private String validateASCII(Object dirty, String column, String s) {
boolean changed = false;
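//debugging hook for one known-bad record; the null-safe equals
//avoids an NPE when the barcode is not set
if(dirty instanceof ImageSearch){
ImageSearch image = (ImageSearch) dirty;
if("eadig00962".equals(image.getBarcode())){
image.getBarcode(); //no-op, somewhere to put a breakpoint
}
}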
for(int i = 0; i < s.length(); i++){
char c = s.charAt(i);
if(c < ASCII.SPACE.value()
&& c != ASCII.CR.value()
&& c != ASCII.LF.value()
&& c != ASCII.TAB.value()){
//ah-ha! we've got one of the buggers! ok lets sanitise it
// and log it to output.
log.error("Illegal character '0x" +
Integer.toHexString(c) + "' found in attribute " + column);
log.error(dirty.toString());
s = s.substring(0,i) + " " + s.substring(i+1, s.length());
changed = true;
}
}
return changed?s:null;
}
}
enum ASCII{
CR(13),
LF(10),
TAB(9),
SPACE(32);
private final int c;
ASCII(int b){c = b; }
int value(){return c;}
}
// <<<<<<<<<<<<<end Validator.java
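The reflective walk in cleanObject above matches getters and setters by name. A minimal sketch of the same idea using the JDK's java.beans.Introspector, which pairs each getter with its setter and simply skips read-only properties, might look like this. BeanStringCleaner, SampleBean, clean and cleanBean are hypothetical names made up for illustration; they are not part of XFire, Aegis or Woodstox.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;

class BeanStringCleaner {

    /** Hypothetical sample bean, only here to make the sketch runnable. */
    public static class SampleBean {
        private String title;
        private final String readOnly = "x\u0002y";
        public String getTitle() { return title; }
        public void setTitle(String title) { this.title = title; }
        public String getReadOnly() { return readOnly; } // no setter on purpose
    }

    /** Replaces control characters other than TAB, LF and CR with a space. */
    static String clean(String s) {
        StringBuilder sb = null; // only allocated when something needs changing
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
                if (sb == null) {
                    sb = new StringBuilder(s);
                }
                sb.setCharAt(i, ' ');
            }
        }
        return sb == null ? s : sb.toString();
    }

    /** Cleans every readable and writable String property; returns how many changed. */
    static int cleanBean(Object bean) {
        int changed = 0;
        try {
            PropertyDescriptor[] props =
                    Introspector.getBeanInfo(bean.getClass(), Object.class).getPropertyDescriptors();
            for (PropertyDescriptor pd : props) {
                Method getter = pd.getReadMethod();
                Method setter = pd.getWriteMethod();
                if (pd.getPropertyType() != String.class || getter == null || setter == null) {
                    continue; // not a String property, or not both readable and writable
                }
                String value = (String) getter.invoke(bean);
                if (value == null) {
                    continue;
                }
                String cleaned = clean(value);
                if (!cleaned.equals(value)) {
                    setter.invoke(bean, cleaned);
                    changed++;
                }
            }
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException("Could not clean " + bean.getClass().getName(), e);
        }
        return changed;
    }
}
```

Building the cleaned string in a StringBuilder also avoids creating a new String for every bad character, which the substring approach does.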
Matthew Kerle
Roman Dolgov wrote:
Hi All,
Is there any way to make Woodstox convert illegal characters into
character entities (or strip them out)?
The problem I am facing is that some of my data may contain illegal
characters, which causes a com.ctc.wstx.exc.WstxIOException: Invalid
white space character.. exception.
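For reference, XML 1.0 only allows the characters #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD and #x10000-#x10FFFF, and a character reference such as &#x1; must also name one of those characters, so in XML 1.0 escaping alone cannot make a control character legal. A minimal sketch of a check and replacement based on that rule follows. XmlChars, isLegal and replaceIllegal are hypothetical names used only for illustration, not Woodstox API.

```java
class XmlChars {

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isLegal(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of s with every illegal code point replaced. */
    static String replaceIllegal(String s, char replacement) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            // codePointAt returns the lone surrogate value for an unpaired
            // surrogate, which isLegal rejects
            int cp = s.codePointAt(i);
            if (isLegal(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(replacement);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

Calling XmlChars.replaceIllegal(value, ' ') before the value reaches the writer gives the same effect as the space substitution in the Validator, but also covers unpaired surrogates and the two non-characters #xFFFE and #xFFFF.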
I want to avoid running a check on all my Java beans and instead have
just one place where everything gets 'fixed'.
Any ideas on how best to handle this problem, either by configuring
Woodstox or by providing some custom writer, are welcome.
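One way the "custom writer" idea could be sketched is a dynamic proxy around the standard javax.xml.stream.XMLStreamWriter interface that cleans text before the real writer sees it. This is only a sketch under assumptions: SanitizingWriters, sanitize, wrap and demo are hypothetical names, the demo runs against whatever StAX implementation the JDK provides rather than Woodstox specifically, and how such a wrapper would be plugged into XFire is not shown here.

```java
import java.io.StringWriter;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class SanitizingWriters {

    /** Replaces control characters other than TAB, LF and CR with a space. */
    static String sanitize(String s) {
        StringBuilder sb = new StringBuilder(s);
        for (int i = 0; i < sb.length(); i++) {
            char c = sb.charAt(i);
            if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
                sb.setCharAt(i, ' ');
            }
        }
        return sb.toString();
    }

    /** Wraps a writer so String text, CDATA and attribute values are cleaned first. */
    static XMLStreamWriter wrap(final XMLStreamWriter target) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            boolean carriesText = name.equals("writeCharacters")
                    || name.equals("writeAttribute")
                    || name.equals("writeCData");
            // In the String overloads of these methods the value is the last
            // argument; the char[] overload of writeCharacters is NOT handled.
            if (carriesText && args != null && args[args.length - 1] instanceof String) {
                args[args.length - 1] = sanitize((String) args[args.length - 1]);
            }
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the writer's own exception
            }
        };
        return (XMLStreamWriter) Proxy.newProxyInstance(
                SanitizingWriters.class.getClassLoader(),
                new Class<?>[] { XMLStreamWriter.class },
                handler);
    }

    /** Writes one element through the wrapped writer and returns the XML. */
    static String demo(String text) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = wrap(XMLOutputFactory.newInstance().createXMLStreamWriter(out));
            w.writeStartElement("value");
            w.writeCharacters(text);
            w.writeEndElement();
            w.flush();
            w.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The appeal of this shape is that the beans are left untouched and the cleaning happens in one place, at the cost of an extra reflective call per write.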
Thanks,
Roman
---------------------------------------------------------------------
To unsubscribe from this list please visit:
http://xircles.codehaus.org/manage_email