[
https://issues.apache.org/jira/browse/ABDERA-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524835
]
Chris Berry commented on ABDERA-60:
-----------------------------------
Just to be sure, I have added code to the previous JUnit test that actually
retrieves text from the XML with Woodstox.
This is pretty unequivocal now...
package com.homeaway.hcdata.store.provider.blogs;
import junit.framework.Test;
import junit.framework.TestCase;
import junit.framework.TestSuite;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import java.io.FileInputStream;
import com.ctc.wstx.stax.WstxInputFactory;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
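/**
* Walks a sample blog XML file with the Woodstox StAX reader: once silently,
* and once logging every event along with the text it carries.
*/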
public class WoodstoxTest extends TestCase {
private static final Log log = LogFactory.getLog( WoodstoxTest.class );
private static final String userdir = System.getProperty( "user.dir" );
private static final String filename = userdir + "/var/blogs/cberry/99/9999/en/blog_9999.xml";
public static Test suite()
{ return new TestSuite( WoodstoxTest.class ); }
public void tearDown() throws Exception
{ super.tearDown(); }
public void setUp() throws Exception
{ super.setUp(); }
public void testWoodstox1() throws Exception {
// we will simply walk the doc and see if it throws an Exception
XMLInputFactory xif = new WstxInputFactory();
XMLStreamReader r = xif.createXMLStreamReader( new FileInputStream( filename ) );
while (r.hasNext()) r.next();
r.close();
}
public void testWoodstox2() throws Exception {
// walk the doc again, this time logging each event and the text it carries
XMLInputFactory xif = new WstxInputFactory();
XMLStreamReader reader = xif.createXMLStreamReader( new FileInputStream( filename ) );
while ( reader.hasNext() ) {
printEventInfo( reader );
}
reader.close();
}
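// Advances the reader by one event and logs it. The numeric case labels below are
// the javax.xml.stream.XMLStreamConstants event codes (1 = START_ELEMENT, ..., 12 = CDATA).
private static void printEventInfo(XMLStreamReader reader) throws XMLStreamException {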
int eventCode = reader.next();
String val = null;
switch (eventCode) {
case 1 :
val= reader.getLocalName();
log.debug("event = START_ELEMENT");
log.debug("Localname = "+val);
break;
case 2 :
val= reader.getLocalName();
log.debug("event = END_ELEMENT");
log.debug("Localname = "+val);
break;
case 3 :
val= reader.getPIData();
log.debug("event = PROCESSING_INSTRUCTION");
log.debug("PIData = " + val);
break;
case 4 :
val= reader.getText();
log.debug("event = CHARACTERS");
log.debug("Characters = " + val);
break;
case 5 :
val= reader.getText();
log.debug("event = COMMENT");
log.debug("Comment = " + val);
break;
case 6 :
val= reader.getText();
log.debug("event = SPACE");
log.debug("Space = " + val);
break;
case 7 :
log.debug("event = START_DOCUMENT");
log.debug("Document Started.");
break;
case 8 :
log.debug("event = END_DOCUMENT");
log.debug("Document Ended");
break;
case 9 :
val= reader.getText();
log.debug("event = ENTITY_REFERENCE");
log.debug("Text = " + val);
break;
case 11 :
val= reader.getText();
log.debug("event = DTD");
log.debug("DTD = " + val);
break;
case 12 :
val= reader.getText();
log.debug("event = CDATA");
log.debug("CDATA = " + val);
break;
}
}
}
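If the walk does throw, it helps to know where in the file the parser gave up. Here is a minimal sketch of that idea. The class name StaxWalk and the method firstError are hypothetical and not part of the attached test case, and it uses XMLInputFactory.newInstance() (whichever StAX implementation is on the classpath) rather than WstxInputFactory directly, so that it runs without extra dependencies.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.Location;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the attached test case.
public class StaxWalk {

    /** Walks every event; returns null on success, else where and why parsing failed. */
    public static String firstError(InputStream in) {
        XMLStreamReader reader = null;
        try {
            // Assumption: any StAX implementation on the classpath will do here.
            reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
            while (reader.hasNext()) {
                reader.next();
            }
            return null;
        } catch (XMLStreamException e) {
            // The location may be absent, so guard against null.
            Location loc = e.getLocation();
            String where = (loc == null)
                    ? "unknown location"
                    : "line " + loc.getLineNumber() + ", column " + loc.getColumnNumber();
            return where + ": " + e.getMessage();
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (XMLStreamException ignored) {
                    // Nothing useful to do if closing fails.
                }
            }
        }
    }

    public static void main(String[] args) {
        // Deliberately mismatched tags, to show what a failure report looks like.
        byte[] broken = "<a><b>text</a>".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstError(new ByteArrayInputStream(broken)));
    }
}
```

Passing a FileInputStream for the blog file instead of the in-memory bytes would give the line and column of the first problem, if there is one.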
> Invalid UTF-8 chars in the AbderaClient
> ---------------------------------------
>
> Key: ABDERA-60
> URL: https://issues.apache.org/jira/browse/ABDERA-60
> Project: Abdera
> Issue Type: Bug
> Affects Versions: 0.3.0
> Environment: N/A
> Reporter: Chris Berry
> Fix For: 0.3.0
>
> Attachments: abdera-utf8-bug.tar.gz
>
>
> After upgrading to the latest 0.3-SNAPSHOT SVN trunk (on ~8/27/2007) from a
> 0.3-SNAPSHOT download from a couple of months ago,
> and after making all required modifications (to catch up with all the API
> changes), I am seeing "Invalid UTF-8" errors.
> Note that these errors only occur in the AbderaClient when I call
> "entry.getContent()".
> I have attached a small, self-contained JUnit test case which
> reproduces/demonstrates this issue.
> It builds and runs out-of-the-box (using mvn install).
> There is also a README.txt that details the output/issue.
> This JUnit test reproduces the error. It is as small as I could get it.
> My Atom Store is based on a Store and StoreProvider (based on code I received
> from Ugo Cei as a starting point).
> Note that all of the code in src/main/java is relatively fixed between the
> latest 0.3-SNAPSHOT and the 0.3-SNAPSHOT that works.
> In other words, my code stayed as fixed as possible, and the latest
> 0.3-SNAPSHOT is the only real variable.
> I'm not saying that the bug isn't in my code, only that it never showed up
> until my upgrade to the latest 0.3-SNAPSHOT.
> I actually suspect that it may be an issue with Woodstox, which the latest
> 0.3-SNAPSHOT significantly upgrades.
> Note: I have looked very closely at the XML file(s) causing this
> issue.
> I ran the Unix utility "iconv" on them, and AFAICT they do not contain
> improper UTF-8.
> Chris Berry
> chriswberry at gmail dot com
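The iconv check described above can also be done from Java, independent of any XML parser, by decoding the file's bytes strictly as UTF-8. This is only a sketch: the class name Utf8Check is hypothetical and not part of the attached test case, and it assumes a JDK recent enough to have java.nio.charset.StandardCharsets and java.nio.file.Files (Java 7 or later).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of the attached test case.
public class Utf8Check {

    /** Returns true if the bytes decode as UTF-8 with no malformed sequences. */
    public static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java Utf8Check <file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(args[0] + (isValidUtf8(bytes) ? ": valid UTF-8" : ": invalid UTF-8"));
    }
}
```

If this reports the file as valid while the parser still complains, that would point away from the file's bytes and toward how the stream is being read.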