Please help with ppt to txt code snippet...
Hello,

I urgently need to convert some of my PowerPoint files into txt files. I tried to use HSLF, but it gives problems for slides with templates. Can anyone please help with a code snippet to do it? For your reference, I am pasting my code here.

<<--START CODE-->>
import java.io.FileWriter;
import org.apache.poi.hslf.extractor.PowerPointExtractor;

// Extract all text from the presentation named by 'filename'
String str;
PowerPointExtractor ppe = new PowerPointExtractor(filename);
str = ppe.getText();

// Write the extracted text to a plain-text file
FileWriter fw = new FileWriter("F:\\newppt.txt");
fw.write(str);
fw.close();
ppe.close();
<<--END CODE-->>

To unsubscribe, e-mail: [EMAIL PROTECTED]
Mailing List: http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project: http://jakarta.apache.org/poi/
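One thing the posted snippet leaves open is what happens when the write fails part-way: the FileWriter is never closed, and FileWriter also uses the platform default encoding, which can mangle non-ASCII slide text. The following is only a minimal sketch of the writing half, assuming the text has already been extracted into a String. The class name TextDump and both method names are hypothetical, and nothing here addresses the template problem itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: writes already-extracted text out and always closes the target. */
final class TextDump {

    /** Writes text to the given writer and closes it, even if the write fails. */
    static void write(String text, Writer out) {
        try (Writer w = out) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes text to a file as UTF-8 instead of the platform default encoding. */
    static void write(String text, Path target) {
        try {
            write(text, Files.newBufferedWriter(target, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this in place, the last four statements of the posted snippet could become a single call such as TextDump.write(str, Path.of("F:\\newppt.txt")), followed by ppe.close().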
Re: 3.0 release (and maven)
Hi Jörg,

I'm replying to the poi-dev list as well as poi-user. It seems that Nick (who is building the current POI 3.0 release candidates) could use some help with Maven.

http://issues.apache.org/bugzilla/show_bug.cgi?id=39977

> --- Additional Comments From [EMAIL PROTECTED] 2007-04-10 02:50 ---
> What do we need to change to fix this? (I'm not a maven user myself)
> The pom file is auto-generated by the maven-dist ant task, based on a template that's in svn.
> Any patches to build.xml or the template poi.pom appreciated :)

Are you the guy? I hope so. A release would be sweet!

BTW, it would be helpful to the guys clearing bugs before the release if anyone who has submitted a bug in Bugzilla would test against RC3 and report the result. If your bug is still there, and you have not already done so, please provide a test case.

I expect that large HSSF / lower memory impact is an RFE for 3.X or 4.0. I have an idea (I'm sure that there are a few others), but I'm not talking about it until the 3.0 release is DONE!

Best Regards,
Dave

On Apr 17, 2007, at 2:29 PM, Jörg Hohwiller wrote:
> Hi there,
>
> everybody is waiting for the 3.0 release, and everything in head is better than an almost three-year-old release. Still, take the time you need to have a release that you are happy with.
>
> I already migrated my project from 2.5.1-final-20040804 to 3.0-rc3 with good results. I did not see the point of the binary incompatibility in HSSF (was it UnicodeString instead of String as a return type? I can't remember), but that does not matter for me. The WordExtractor is also very useful and will allow me to kick out tm-extractors.
>
> Are you planning to put the 3.0 release into the Maven repository as soon as it is out? I would be very pleased if this could happen quickly. If I can help in any way with this task, just let me know. Is there something to discuss that could be done now rather than after the release is out? E.g. whether the groupId should change from "poi" to "org.apache.poi", which would make sense (Lucene did this too).
>
> Regards
> Jörg
3.0 release (and maven)
Hi there,

everybody is waiting for the 3.0 release, and everything in head is better than an almost three-year-old release. Still, take the time you need to have a release that you are happy with.

I already migrated my project from 2.5.1-final-20040804 to 3.0-rc3 with good results. I did not see the point of the binary incompatibility in HSSF (was it UnicodeString instead of String as a return type? I can't remember), but that does not matter for me. The WordExtractor is also very useful and will allow me to kick out tm-extractors.

Are you planning to put the 3.0 release into the Maven repository as soon as it is out? I would be very pleased if this could happen quickly. If I can help in any way with this task, just let me know. Is there something to discuss that could be done now rather than after the release is out? E.g. whether the groupId should change from "poi" to "org.apache.poi", which would make sense (Lucene did this too).

Regards
Jörg
Re: Encryption/password protected excel/word files
Since the Apache servers are in the United States, and the Apache Software Foundation is a US not-for-profit corporation, all projects are required to meet US export regulations with regard to encryption technology. See http://www.apache.org/licenses/exports/

Regards,
Dave

On Apr 17, 2007, at 12:10 PM, Rainer Klute wrote:
> Andrew C. Oliver wrote:
>> Guys, let's stay away from encryption. We'd have to do all this registering with the government and a whole lot of hassle for not much benefit. Lame encryption to boot.
>
> This depends. We here in Europe, or at least here in Germany, don't have to do any registering with the government for encryption/decryption.
>
> Best regards
> Rainer Klute
Re: Encryption/password protected excel/word files
Andrew C. Oliver wrote:
> Guys, let's stay away from encryption. We'd have to do all this
> registering with the government and a whole lot of hassle for not much
> benefit. Lame encryption to boot.

This depends. We here in Europe, or at least here in Germany, don't have to do any registering with the government for encryption/decryption.

Best regards
Rainer Klute

Rainer Klute IT-Consulting
Dipl.-Inform. Rainer Klute
E-Mail: [EMAIL PROTECTED]
Körner Grund 24, D-44143 Dortmund
Phone: +49 172 2324824
Fax: +49 231 5349423
OpenPGP fingerprint: E4E4386515EE0BED5C162FBB5343461584B5A42E
Re[2]: Encryption/password protected excel/word files
On Tue, 17 Apr 2007, Yegor Kozlov wrote:
> I wonder if Microsoft uses encryption compatible with javax.crypto.*.
> If yes, we have a chance to decode it. Otherwise it is not worth the trouble.

I don't think they do. For PPT, there's a choice of about 10 different encryption options, almost all of which have very similar names, and most of them have Microsoft somewhere in their name...

I think that supporting encrypting / decrypting the files is not going to be possible. However, we might be able to detect encrypted files (as we do for ppt), but it'll take some work.

Nick
Re[2]: Encryption/password protected excel/word files
I wonder if Microsoft uses encryption compatible with javax.crypto.*. If yes, we have a chance to decode it. Otherwise it is not worth the trouble.

Yegor

NB> On Tue, 17 Apr 2007, Justin Warren wrote:
>> I should mention that the exceptions don't really tell if the files are
>> password protected or not. For word, I catch an
>> ArrayIndexOutOfBoundsException, or java.lang.NegativeArraySizeException.
>> I'm guessing that is not the expected behaviour.

NB> With PowerPoint, we did find one record we could look for early on that
NB> indicates if the file is encrypted or not. For the others, we haven't
NB> spotted anything suitable.

NB> The problem is that if the file is encrypted, lots of the core records are
NB> there, but their data is encrypted, and hence garbage if you try to read
NB> it as if it wasn't. Unless we can tell very early on that a file is
NB> encrypted, we can't just look through the record list looking for the
NB> encrypted record flag, since the parent records can't be read properly.
NB> Instead, we must find either an absolute offset to an indicator, or one
NB> non-encrypted record at a given location that'll have a child that tells
NB> you it's encrypted.

NB> If someone encrypts both the properties and the document, it's easy, as
NB> you can tell at the poifs level. If they just encrypt the document, it's
NB> hard. See EncryptedSlideShow in hslf for an example of how to do it for
NB> PowerPoint. Any suggestions for a similar way to do it for Word or Excel
NB> gratefully received :)

NB> Nick
Re: Encryption/password protected excel/word files
Guys, let's stay away from encryption. We'd have to do all this registering with the government and a whole lot of hassle for not much benefit. Lame encryption to boot.

Nick Burch wrote:
> On Tue, 17 Apr 2007, Justin Warren wrote:
>> I should mention that the exceptions don't really tell if the files are password protected or not. For word, I catch an ArrayIndexOutOfBoundsException, or java.lang.NegativeArraySizeException. I'm guessing that is not the expected behaviour.
>
> With PowerPoint, we did find one record we could look for early on that indicates if the file is encrypted or not. For the others, we haven't spotted anything suitable.
>
> The problem is that if the file is encrypted, lots of the core records are there, but their data is encrypted, and hence garbage if you try to read it as if it wasn't. Unless we can tell very early on that a file is encrypted, we can't just look through the record list looking for the encrypted record flag, since the parent records can't be read properly. Instead, we must find either an absolute offset to an indicator, or one non-encrypted record at a given location that'll have a child that tells you it's encrypted.
>
> If someone encrypts both the properties and the document, it's easy, as you can tell at the poifs level. If they just encrypt the document, it's hard. See EncryptedSlideShow in hslf for an example of how to do it for PowerPoint. Any suggestions for a similar way to do it for Word or Excel gratefully received :)
>
> Nick
RE: Encryption/password protected excel/word files
On Tue, 17 Apr 2007, Justin Warren wrote:
> I should mention that the exceptions don't really tell if the files are password protected or not. For word, I catch an ArrayIndexOutOfBoundsException, or java.lang.NegativeArraySizeException. I'm guessing that is not the expected behaviour.

With PowerPoint, we did find one record we could look for early on that indicates if the file is encrypted or not. For the others, we haven't spotted anything suitable.

The problem is that if the file is encrypted, lots of the core records are there, but their data is encrypted, and hence garbage if you try to read it as if it wasn't. Unless we can tell very early on that a file is encrypted, we can't just look through the record list looking for the encrypted record flag, since the parent records can't be read properly. Instead, we must find either an absolute offset to an indicator, or one non-encrypted record at a given location that'll have a child that tells you it's encrypted.

If someone encrypts both the properties and the document, it's easy, as you can tell at the poifs level. If they just encrypt the document, it's hard. See EncryptedSlideShow in hslf for an example of how to do it for PowerPoint. Any suggestions for a similar way to do it for Word or Excel gratefully received :)

Nick
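To make the "absolute offset to an indicator" idea concrete, here is a rough sketch of what such a check could look like once a suitable location is known. It is not how EncryptedSlideShow works, and the offset and mask used in the usage note are made-up placeholders, not values taken from the Word or Excel formats. The class and method names are hypothetical too. The only assumption is that the indicator is a bit inside a little-endian 16-bit field near the start of the main document stream.

```java
/** Hypothetical probe: tests a flag bit stored at a fixed offset in the first bytes of a stream. */
final class EncryptionFlagProbe {

    /**
     * Reads the little-endian 16-bit value at 'offset' in 'streamStart'
     * and reports whether any bit of 'mask' is set in it.
     * Both 'offset' and 'mask' must come from knowledge of the file format;
     * this sketch does not know the real values.
     */
    static boolean flagSet(byte[] streamStart, int offset, int mask) {
        if (offset < 0 || streamStart.length < offset + 2) {
            throw new IllegalArgumentException("stream too short for offset " + offset);
        }
        int flags = (streamStart[offset] & 0xFF) | ((streamStart[offset + 1] & 0xFF) << 8);
        return (flags & mask) != 0;
    }
}
```

A caller would read the first few bytes of the document stream into an array and then ask, for example, EncryptionFlagProbe.flagSet(header, 10, 0x0100), where 10 and 0x0100 are the placeholder offset and mask. Because only unencrypted header bytes are touched, the check does not depend on being able to parse any of the encrypted records.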
RE: Encryption/password protected excel/word files
I should mention that the exceptions don't really tell if the files are password protected or not. For Word, I catch an ArrayIndexOutOfBoundsException or a java.lang.NegativeArraySizeException. I'm guessing that is not the expected behaviour.

thanks

-----Original Message-----
From: Justin Warren
Sent: Monday, April 16, 2007 2:49 PM
To: poi-user@jakarta.apache.org
Subject: Encryption/password protected excel/word files

Hi all,

I am trying to detect if a file is password protected before reading it (i.e., if it is password protected, I want to ignore it). Right now, I am just catching exceptions that get thrown, but I was wondering if there was another way of doing this.

Thanks
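For reference, the catch-and-skip workaround described in this message might be structured roughly as below. This is only a sketch: TextReader is a hypothetical stand-in for whatever call actually parses the document, and the Status names are invented for illustration. As the message itself notes, the two exceptions do not prove that a file is password protected, so the sketch labels such files as suspect rather than encrypted.

```java
/** Hypothetical helper that sorts files by how an attempted read ends. */
final class UnreadableFileFilter {

    /** Stand-in for the real parsing call (assumption, not a POI interface). */
    interface TextReader {
        String read(String path) throws Exception;
    }

    enum Status { READABLE, SUSPECT, FAILED }

    static Status classify(String path, TextReader reader) {
        try {
            reader.read(path);
            return Status.READABLE;
        } catch (ArrayIndexOutOfBoundsException | NegativeArraySizeException e) {
            // The symptoms reported above: possibly password protected, possibly just corrupt.
            return Status.SUSPECT;
        } catch (Exception e) {
            // Anything else (missing file, I/O error, unsupported format).
            return Status.FAILED;
        }
    }
}
```

Keeping the suspect case separate from other failures at least lets the caller log which files were skipped for this reason, instead of silently ignoring every error.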