Hi Ralph,

I haven't tested the PPT extractor with any other languages.  I remember
reading about other people having problems with different character sets
though.

Could you send a before and after example file here or to bugzilla?
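
One untested guess in the meantime: if the extracted text gets written out
using the platform default encoding (MacRoman on your machine) and is then
read as ISO Latin 1, some characters would come out wrong. Below is a
minimal sketch of naming the charset explicitly on both sides.
EncodingSketch and its methods are made-up names for illustration and are
not part of Slide, and I have not checked which encoding the extractor
itself assumes when it reads the PPT file.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical helper, not part of Slide: shows explicit charset handling only.
public class EncodingSketch {

    // Encode text with a named charset instead of the platform default.
    public static byte[] encode(String text, String charsetName) {
        ByteBuffer encoded = Charset.forName(charsetName).encode(text);
        byte[] bytes = new byte[encoded.remaining()];
        encoded.get(bytes);
        return bytes;
    }

    // Decode bytes with a named charset instead of the platform default.
    public static String decode(byte[] bytes, String charsetName) {
        return Charset.forName(charsetName).decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        // German umlauts written as Unicode escapes, so the source file's
        // own encoding cannot interfere with the demonstration.
        String text = "\u00e4\u00f6\u00fc \u00c4\u00d6\u00dc \u00df";

        byte[] utf8 = encode(text, "UTF-8");

        // Same charset on both sides: the text survives the round trip.
        System.out.println(decode(utf8, "UTF-8").equals(text));

        // Different charsets on each side: the umlauts are garbled.
        System.out.println(decode(utf8, "ISO-8859-1").equals(text));
    }
}
```

If the output file and the editor agree on one charset and the umlauts are
still wrong, the problem is more likely inside the extractor, which is why
the example files would help.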

-Ryan Rhodes


-----Original Message-----
From: Ralph Scheuer [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, July 28, 2004 10:01 AM
To: slide
Subject: MSPowerPointExtractor problem

Hello everybody,

When I was searching for a Java class to extract text from PowerPoint 
files, I accidentally discovered Slide.

I pulled the MSPowerPointExtractor class and some other stuff it 
depends on via CVS and tried it for some text extraction.

The method I used looks very similar to the provided example main 
method (see below).

However, when I tried to extract text from a German PowerPoint 
presentation, I had some problems with the encoding. I did not know 
which encoding to use; converting the output to ISO Latin 1 with my 
text editor solved only part of the problem (some German umlauts were 
displayed correctly, some were not).

Is this a known issue or am I doing something wrong? Any hints for me?

Thanks in advance.

Ralph Scheuer

BTW, I am using Mac OS X 10.3.4 with JDK 1.4.2_03; the native encoding 
on this platform is MacRoman.


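     public static String contentStringForData(NSData data){

        StringBuffer buf = new StringBuffer();
        try{
            ByteArrayInputStream input = data.stream();
            MSPowerPointExtractor ex = new MSPowerPointExtractor(null, null);

            Reader reader = ex.extract(input);

            // Test for end of stream before appending, so that the -1
            // marker is never added to the buffer as a bogus character.
            int c;
            while( (c = reader.read()) != -1 )
                {
                    buf.append((char)c);
                }
        }catch(Exception e){
            // Report the failure instead of silently returning partial text.
            e.printStackTrace();
        }

        return buf.toString();
     }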
