The spec says ...

"getEncoding() returns the encoding of the document, or null if not
specified. The default value is "UTF-8". Specification of other values is
implementation dependent."

Later, in relation to an example document that has an <?xml version="1.0"?>
declaration, it says ...

"The XML encoding default of "UTF-8" is applied to the creation of new
documents. After a load() it will have an encoding value from the document,
or null, if no encoding value was found in the document."

So my understanding is that a new XMLDocument instance reports "UTF-8" for
its encoding, and that loading a document with a declaration such as the one
above (no encoding attribute) changes that state, so that getEncoding()
returns null after the load. It's not clear to me whether the same state
change happens if the incoming document has no XML declaration at all.
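
To make that reading concrete, here is a minimal sketch of the behaviour I
think the spec text describes. It is not Tuscany or SDO code: EncodingModel
and its methods are hypothetical names used only for illustration, it looks
at nothing but the XML declaration, and treating a document with no
declaration the same as one with no encoding attribute (null) is my
assumption, since that is exactly the case the spec leaves unclear.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of the spec wording; not part of SDO or Tuscany.
final class EncodingModel {
    // XML declaration at the start of the text; group 1 is the optional encoding value.
    private static final Pattern DECL = Pattern.compile(
        "^<\\?xml\\s+version\\s*=\\s*[\"'][^\"']*[\"']"
        + "(?:\\s+encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"'])?"
        + "[^?]*\\?>");

    // "The XML encoding default of UTF-8 is applied to the creation of new documents."
    private String encoding = "UTF-8";

    String getEncoding() {
        return encoding;
    }

    // After a load: the encoding value from the document, or null if none was found.
    void load(String xml) {
        Matcher m = DECL.matcher(xml);
        // group(1) is null when the declaration has no encoding attribute;
        // a missing declaration is also mapped to null (assumption).
        encoding = m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        EncodingModel doc = new EncodingModel();
        System.out.println("new document:          " + doc.getEncoding());
        doc.load("<?xml version=\"1.0\"?><a/>");
        System.out.println("declaration, no attr:  " + doc.getEncoding());
        doc.load("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>");
        System.out.println("declaration with attr: " + doc.getEncoding());
        doc.load("<a/>");
        System.out.println("no declaration:        " + doc.getEncoding());
    }
}
```

Under this reading the only way to see "UTF-8" after a load is for the
document itself to declare it.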

Here's the Tuscany behaviour as far as my experiments go, which I think
has some issues ...

  public void testEncoding2() throws IOException
  {
      TypeHelper types = hc.getTypeHelper();
      Type stringType = types.getType("commonj.sdo", "String");
      DataObject customerType = hc.getDataFactory().create("commonj.sdo", "Type");
      customerType.set("uri", "http://example.com/simple");
      customerType.set("name", "Simple");
      DataObject multiProperty = customerType.createDataObject("property");
      multiProperty.set("name", "name");
      multiProperty.set("type", stringType);
      types.define(customerType);
      DataObject obj = hc.getDataFactory().create("http://example.com/simple", "Simple");
      obj.set("name", "John Smith");

      ByteArrayOutputStream baos = new ByteArrayOutputStream();
      hc.getXMLHelper().save(obj, "http://www.example.com/company", "company", baos);

      System.out.println("\nserializing newed instance with default\n");
      ByteArrayInputStream bais = new ByteArrayInputStream(baos.toString().getBytes());
      XMLDocument xmlDoc = hc.getXMLHelper().load(bais);
      if( !"UTF-8".equals(xmlDoc.getEncoding()) )
      {
          fail("Encoding ('" + xmlDoc.getEncoding() + "' is not correct. UTF-8 is the expected encoding.");
      }
      System.out.println(baos);

      StringWriter sw = new StringWriter();
      hc.getXMLHelper().save(xmlDoc, sw, null);

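      // Strip the leading XML declaration (38 characters) plus the line
      // separator that follows it, assumed here to be 2 characters long.
      String nodecl = sw.toString().substring(40);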
      System.out.println("\nParsing doc with no encoding attr\n");
      System.out.println(nodecl);

      ByteArrayInputStream bais2 = new ByteArrayInputStream(nodecl.getBytes());
      XMLDocument xmlDoc2 = hc.getXMLHelper().load(bais2);

      System.out.println("1) encoding = " + xmlDoc2.getEncoding());

      XMLDocument noDeclDirect = hc.getXMLHelper().load(nodecl);

      System.out.println("2) encoding = " + noDeclDirect.getEncoding());


      System.out.println("\nRound trip serializing doc that had no incoming encoding attr\n");

      hc.getXMLHelper().save(xmlDoc2, System.out, null);

      System.out.println("Parsing and serializing an ascii document");
      String asciiDoc = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + nodecl;
      System.out.println(asciiDoc);
      XMLDocument xmlDoc3 = hc.getXMLHelper().load(asciiDoc);
      hc.getXMLHelper().save(xmlDoc3, System.out, null);



      xmlDoc.setEncoding(null);
      System.out.println("\nSerializing instance with encoding explicitly set to null\n");
      System.out.println("Encoding reported as " + xmlDoc.getEncoding());
      try {
        StringWriter sw2 = new StringWriter();
        hc.getXMLHelper().save(xmlDoc, sw2, null);
        System.out.println(sw2);
      }
      catch (NullPointerException e) {
        System.out.println("Tuscany gives null pointer exception on serialization after setEncoding(null)");
      }
   }


gives, ......

serializing newed instance with default

<?xml version="1.0" encoding="UTF-8"?>
<company:company xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:company="http://www.example.com/company"
    xmlns:simple="http://example.com/simple" xsi:type="simple:Simple">
  <name>John Smith</name>
</company:company>

Parsing doc with no encoding attr

<company:company xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:company="http://www.example.com/company"
    xmlns:simple="http://example.com/simple" xsi:type="simple:Simple">
  <name>John Smith</name>
</company:company>
1) encoding = UTF-8
2) encoding = ASCII

Round trip serializing doc that had no incoming encoding attr

<?xml version="1.0" encoding="UTF-8"?>
<company:company xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:company="http://www.example.com/company"
    xmlns:simple="http://example.com/simple" xsi:type="simple:Simple">
  <name>John Smith</name>
</company:company>Parsing and serializing an ascii document
<?xml version="1.0" encoding="UTF-8"?>
<company:company xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:company="http://www.example.com/company"
    xmlns:simple="http://example.com/simple" xsi:type="simple:Simple">
  <name>John Smith</name>
</company:company>
<?xml version="1.0" encoding="ASCII"?>
<company:company xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:company="http://www.example.com/company"
    xmlns:simple="http://example.com/simple" xsi:type="simple:Simple">
  <name>John Smith</name>
</company:company>
Serializing instance with encoding explicitly set to null

Encoding reported as null
Tuscany gives null pointer exception on serialization after setEncoding(null)
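
One way an implementation might avoid the NullPointerException in that last
step is to resolve a null encoding to a default before handing it to the
serializer. The following is only a sketch under that assumption: the spec
text quoted above does not say what save should do with a null encoding,
EncodingFallback, resolve and save are hypothetical names rather than
Tuscany or EMF API, and StandardCharsets needs Java 7 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; illustrates a null-safe default, not Tuscany's save path.
final class EncodingFallback {
    // A null encoding falls back to UTF-8 (assumed policy); otherwise look up the named charset.
    static Charset resolve(String encoding) {
        return encoding == null ? StandardCharsets.UTF_8 : Charset.forName(encoding);
    }

    // Writes a declaration naming the resolved charset, then the body, using that charset.
    static byte[] save(String body, String encoding) throws IOException {
        Charset cs = resolve(encoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, cs)) {
            w.write("<?xml version=\"1.0\" encoding=\"" + cs.name() + "\"?>\n");
            w.write(body);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Serializing with a null encoding no longer fails; the declaration names UTF-8.
        byte[] bytes = save("<name>John Smith</name>", null);
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Whether a silent fallback is the right behaviour, as opposed to omitting the
encoding attribute or rejecting null in setEncoding(), is a separate question.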

================================

I have to stop looking at this for a while, so I'm dumping this snapshot of
my investigation here to be picked up later.

Kelvin


On 25/01/2008, kelvin goodson <[EMAIL PROTECTED]> wrote:
>
> Hi Amita,
>
> I blew away my entire repo and still got a clean build.  However,  it has
> got me looking into what exactly the spec means when it says ...
>
> getEncoding() returns the encoding of the document, or null if not
> specified. The
> default value is "UTF-8". Specification of other values is implementation
> dependent.
>
>
> I'll report thoughts shortly ..
>
> Kelvin.
>
> On 25/01/2008, Amita Vadhavkar <[EMAIL PROTECTED]> wrote:
> >
> > I am making this a separate thread and referring to it in
> > http://www.mail-archive.com/[email protected]/msg02447.html
> > as this may be a local setup issue on my side, and I don't want to mix it
> > with the release thread.
> >
> > 0] I checked that the EMF version in my classpath is 2.2.3.
> >
> > 1] Eclipse 3.2.1
> >
> > http://help.eclipse.org/help32/index.jsp?topic=/org.eclipse.emf.doc/references/javadoc/org/eclipse/emf/ecore/xmi/XMLResource.html
> > getEncoding
> > public java.lang.String getEncoding()
> >     Get the XML encoding for this resource. The default is ASCII.
> >
> > 2] SDO Spec says default is UTF-8.
> >
> > 3] To adhere to the SDO Spec, the rootmost place for a change can be the
> > XMLDocumentImpl class itself, as this is where the SDO Impl implements
> > commonj XMLDocument and contains EMF's XMLResource.
> > So if inside
> > protected XMLDocumentImpl(ExtendedMetaData extendedMetaData, Object options)
> > we do resource.setEncoding("UTF-8"); after the Resource is obtained from
> > EMF's ResourceSet, we are all set for save as well as load.
> >
> > 4] If we look at the current 1545 patch, it was doing setEncoding() in
> > XMLHelperImpl.createDocument(). This gets called during save() but not
> > during load().
> > In this same class, load(InputStream inputStream, String locationURI,
> > Object options) and load(Reader inputReader, String locationURI, Object
> > options) create *new instances* of XMLDocumentImpl, which do not have UTF-8
> > set because, from EMF's viewpoint, the default is ASCII. So, during a load
> > operation, getEncoding() returned ASCII even if the document was saved
> > using 'UTF-8'.
> >
> > 5] But 3], as well as 1545.patch, is still not correct, for 2 reasons -
> >   1> How can an end user use *any encoding* other than UTF-8 during save
> >   and get the same encoding back during load?
> >   2> Also a logical assumption - when we save a resource with a particular
> >   encoding, load should return the resource with the same encoding - how do
> >   we make this happen? Because (per 4]) a *new instance* of XMLDocument will
> >   always default to ASCII (per 1]).
> >
> > For 1>, one way can be - the user can pass XMLResource.OPTION_ENCODING
> > during save, and inside the SDO Impl we need to make sure this option is
> > passed to EMF. When such an option is passed in during save() it should be
> > honored and the default UTF-8 should not be used.
> > Ref:
> >
> > http://help.eclipse.org/help32/index.jsp?topic=/org.eclipse.emf.doc/references/javadoc/org/eclipse/emf/ecore/xmi/XMLResource.html
> >
> > OPTION_ENCODING
> > public static final java.lang.String OPTION_ENCODING
> >     Specify the XML encoding to be used *during save*.
> >
> > But for 2>, i.e. getting the same encoding back during load without any
> > special option passing and without setEncoding() - what is the solution?
> >
> > Also found below - so the correct behavior should be available in EMF
> > 2.2.3 and further.
> >
> > http://www.eclipse.org/modeling/emf/news/relnotes.php?project=emf&version=2.2.x
> >
> >     * 2.2.2
> >     * 2.2.2RC2 (2 bugs fixed)
> >           o 173815 Bad link exist on EMF/SDO/XSD Developer Guide
> >           o 173681 encoding is ignored during XMLResource#load
> > *********************************************************************************************************************
> >
> >
> > surefire-reports
> >
> > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.172 sec <<< FAILURE!
> > testEncoding(org.apache.tuscany.sdo.test.XMLHelperTestCase)  Time elapsed: 3.125 sec  <<< FAILURE!
> > junit.framework.AssertionFailedError: Encoding ('ASCII' is not correct. UTF-8 is the expected encoding.
> >     at junit.framework.Assert.fail(Assert.java:47)
> >     at org.apache.tuscany.sdo.test.XMLHelperTestCase.testEncoding(XMLHelperTestCase.java:313)
> >     at org.apache.tuscany.sdo.test.XMLHelperTestCase.testEncoding(XMLHelperTestCase.java:313)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at junit.framework.TestCase.runTest(TestCase.java:154)
> >     at junit.framework.TestCase.runBare(TestCase.java:127)
> >     at junit.framework.TestResult$1.protect(TestResult.java:106)
> >     at junit.framework.TestResult.runProtected(TestResult.java:124)
> >     at junit.framework.TestResult.run(TestResult.java:109)
> >     at junit.framework.TestCase.run(TestCase.java:118)
> >     at junit.framework.TestSuite.runTest(TestSuite.java:208)
> >     at junit.framework.TestSuite.run(TestSuite.java:203)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.maven.surefire.junit.JUnitTestSet.execute(JUnitTestSet.java:213)
> >     at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:138)
> >     at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:125)
> >     at org.apache.maven.surefire.Surefire.run(Surefire.java:132)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:290)
> >     at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:818)
> >
> >
> >
> > *********************************************************************************************************************
> > Any clue?
> >
> > Regards,
> > Amita
> >
>
>
