Santiago Gala wrote:
<snip>
> > > I'm fighting a bug in the way Jetspeed processes multibyte character
> > > channels. If you are experiencing a bug with multibyte
> > > characters in Jetspeed, please give more details.
> > >
> >
> > I have also worked with multibyte character channels (Big5). What problem
> > are you facing?
> >
>
> If you run Jetspeed from CVS, you will get a channel
> (www.javable.com/rus/rss.shtml) in Russian. The file reaches the cache correctly
> as UTF-8 (even though its encoding attribute says otherwise), but the characters
> come out broken in the portlet, as "?".
>
> Even when I modify the encoding attribute in the cache, the channel is not
> displayed correctly. I think there is a problem either in Xerces, in Xalan, or
> in the way we call them to process the channel. I am trying to fix this before
> we release a new beta.
>
> I would also like to test with Chinese, Japanese, Korean, ... channels, but I
> don't know of any OCS feeds for them. Do you have a list?
>
I solved it!
Along the way I updated to the latest Turbine (just in case), but it was not the culprit.
The culprit was ECS (plus a bug on our side).
I am committing the updated RSSPortlet.java, which fixes the bug. I am also committing
a patched ecs.jar, and I will remove the old one if nobody objects.
I had to patch org.apache.ecs.GenericElement.java, which simply did not know how to
convert multibyte characters back to a String.
As I'm not involved in ECS, I am including the patch here and sending this message to the ecs
list and Jon Stevens. It is not very clean, but the principle is: never write characters
into a byte array (here, a ByteArrayOutputStream), since you will lose the high byte. I have tested
the changes and found no problems.
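For anyone who wants to see the effect outside ECS, here is a minimal standalone sketch of that principle. It is not the ECS code: the class and method names are made up for illustration, and ISO-8859-1 is only an assumed stand-in for a platform default encoding that cannot represent Cyrillic.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of ECS or Jetspeed.
class CharRoundTrip {

    // Lossy path: the characters are encoded into a byte buffer with a
    // single-byte charset (ISO-8859-1, assumed here as the "default"),
    // then decoded back. Anything the charset cannot map becomes '?'.
    static String viaByteArray(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        baos.write(encoded, 0, encoded.length);
        return new String(baos.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Lossless path: the characters never leave the char domain,
    // so no encoding step can damage them.
    static String viaStringWriter(String text) {
        StringWriter strwtr = new StringWriter();
        PrintWriter wtr = new PrintWriter(strwtr);
        wtr.print(text);
        wtr.flush();
        return strwtr.toString();
    }

    public static void main(String[] args) {
        String russian = "\u041F\u0440\u0438\u0432\u0435\u0442"; // "Privet" in Cyrillic
        System.out.println(viaByteArray(russian).equals(russian));    // prints false
        System.out.println(viaStringWriter(russian).equals(russian)); // prints true
    }
}
```

With pure ASCII input both paths agree, which is probably why the problem went unnoticed until a non-Latin channel showed up.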
To check for the problem, start Jetspeed, wait until the feeds are processed, and search for
"javable". There should be two channels, one in English and one in Russian. Without the patch, the
Russian one is filled with "?".
After the patch, it should display plenty of Cyrillic characters.
It should now also work with Japanese, Chinese, etc. channels.
==== Patch for org.apache.ecs.GenericElement.java
--- /home/sgala/GenericElement.java Sat Nov 18 00:24:52 2000
+++ org/apache/ecs/GenericElement.java Sat Nov 18 00:40:41 2000
@@ -56,6 +56,8 @@
import java.io.OutputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
+import java.io.Writer;
+import java.io.StringWriter;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.io.Serializable;
@@ -715,20 +717,14 @@
*/
public final String toString()
{
- String out = null;
- try
- {
- ByteArrayOutputStream baos = new ByteArrayOutputStream();
- BufferedOutputStream bos = new BufferedOutputStream(baos);
- output(bos);
- bos.flush();
- out = baos.toString();
- bos.close();
- baos.close();
- }
- catch (IOException ioe)
- {
- }
+ String out = null;
+
+ StringWriter strwtr = new StringWriter( );
+ PrintWriter wtr = new PrintWriter ( strwtr );
+ output( wtr );
+ wtr.flush();
+ out = strwtr.toString();
+
return(out);
}
@@ -737,23 +733,15 @@
*/
public final String toString(String codeset)
{
- ByteArrayOutputStream baos = new ByteArrayOutputStream();
- BufferedOutputStream bos = new BufferedOutputStream(baos);
+
+ StringWriter strwtr = new StringWriter( );
+ PrintWriter wtr = new PrintWriter ( strwtr );
String out = null;
- try
- {
- output(bos);
- bos.flush();
- out = baos.toString(codeset);
- bos.close();
- baos.close();
- }
- catch (UnsupportedEncodingException use)
- {
- }
- catch (IOException ioe)
- {
- }
+
+ output( wtr );
+ wtr.flush();
+ out = strwtr.toString();
+
return(out);
}
=== End of patch