Hi,

I have removed the Application Client stuff (the bootstrap) and run the extractor as a standalone application (under the same JDK, with all the same libraries), and it works. I really don't know how or why it works now, but it does.

Thanks

Etienne

[EMAIL PROTECTED]
09/11/2005 02:06 PM

Please reply to: "POI Users List" <[email protected]>
To: [email protected]
cc:
Subject: Re: Fw: got an error when running on UNIX-AIX: illegal block count!

Can you upgrade that to JDK 1.4.x?

[EMAIL PROTECTED] wrote:
>
> Thanks Andrew for your idea,
>
> I really thought you had got it right, but it didn't work.
>
> I have tried the "InputStream fStream = new BufferedInputStream(new
> FileInputStream(wordDoc));" way, but I got the same error.
>
> I have tried putting -Dfile.encoding=ISO-8559-1 and -Dfile.encoding=8559-1
> in my "ear launcher" and I got the same result, plus a
>
> "Warning:  java.io.UnsupportedEncodingException: 8559-1" warning for both
> values.
>
> I have tried with XP and 2002 Word documents. I got the same result.
>
> I am using an old AIX (version 4, release 3) and the IBM JDK 1.3.1. The
> running application is a "(J2EE) Application Client" running on WAS 4.0.6.
>
> I could easily get rid of the bootstrap of WAS 4.0.6 if the problem lies
> there (J2EE libs). It is a Spring application, and I just need the
> datasource from WAS. It could be replaced by a straight connection.
>
> Thanks
>
> Etienne
> Montreal
>
> [EMAIL PROTECTED]
> ----- Forwarded by Etienne Laverdiere/VMD/Desjardins on 09/11/2005 01:50 PM -----
>
> [EMAIL PROTECTED]
> 09/11/2005 12:52 PM
>
> Please reply to: "POI Users List" <[EMAIL PROTECTED].apache.org>
> To: [email protected]
> cc:
> Subject: Re: got an error when running on UNIX-AIX: illegal block count!
>
>
> Try
>
>  >
>  >   public Object parse(Object file) {
>  >             File wordDoc = new File((String) file);
>  >
>  >             WordExtractor we = new WordExtractor();
>  >             String fullText = "";
>  >             try {
>  >                   // Wrap the file stream in a buffer before handing it to the extractor.
>  >                   InputStream fStream = new BufferedInputStream(
>  >                         new FileInputStream(wordDoc));
>  >                   fullText = we.extractText(fStream); // <---- ERROR HERE
>  >             } catch (FileNotFoundException e1) {
>  >                   logger.error("FileNotFound while parsing word document " + e1);
>  >                   e1.printStackTrace();
>  >             } catch (Exception e) {
>  >                   logger.error("Error while parsing word document " + e);
>  >                   e.printStackTrace();
>  >             }
>  >             return fullText;
>  >       }
>
> You probably don't see it elsewhere because AIX's VM and IO support is
> really slow.  I love AIX because it is a UNIX variant and I love UNIX,
> but it certainly is not the best UNIX, and the IBM VM is frankly
> pathetic and uses a decidedly retro garbage collector.  Thus your
> stream is getting behind.  Since we don't inherently do the buffering,
> POIFS just pukes unless you use a buffered input stream... (which you're
> naughty for not doing for all files anyhow)
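
A minimal sketch of a more defensive variant of the buffering advice above: read the whole file into memory first, so that the parser never sees a short read. `StreamSlurper`, `readFully` and `inMemory` are hypothetical names, not part of POI or textmining. The sketch assumes the document fits in memory and that the extractor accepts any `InputStream`, as the `extractText(fStream)` call in this thread suggests. Whether short reads are really the cause of the error here is not established.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, shown for illustration only.
final class StreamSlurper {

    /** Reads the stream to its end, looping because read() may return fewer bytes than requested. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[512];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    /** Returns an in-memory stream holding the complete content of the given stream. */
    static InputStream inMemory(InputStream in) throws IOException {
        return new ByteArrayInputStream(readFully(in));
    }
}
```

With this helper, the call in the quoted code would become something like `we.extractText(StreamSlurper.inMemory(new FileInputStream(wordDoc)))`, with the file stream closed afterwards.
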
>
> If that doesn't work pass -Dfile.encoding=ISO-8559-1 (or if that doesn't
> work try 8559-1)
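
Before passing a value to `-Dfile.encoding`, it can help to check that the JVM knows the name. The sketch below uses `java.nio.charset.Charset`, which needs JDK 1.4 or later, so it would not run on the 1.3.1 VM discussed in this thread. `CharsetNameCheck` and `isKnown` are hypothetical names. For reference, the standard Latin-1 charset name is `ISO-8859-1`; I am not aware of any charset registered as `ISO-8559-1` or `8559-1`.

```java
import java.nio.charset.Charset;

// Hypothetical helper, shown for illustration only.
final class CharsetNameCheck {

    /** Returns true if this JVM recognises the given charset name or alias. */
    static boolean isKnown(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for a null or syntactically illegal name
            // (IllegalCharsetNameException is a subclass).
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {"ISO-8859-1", "ISO-8559-1", "8559-1"};
        for (int i = 0; i < candidates.length; i++) {
            System.out.println(candidates[i] + " -> " + isKnown(candidates[i]));
        }
    }
}
```
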
>
> It could also be that AIX is a red herring and that this DOC is pre-Word
> 6 and thus doesn't use the OLE2CDF format, or is actually blank-blank
> (meaning no document in the DOC file, just the surrounding OLE wrapper)
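
One way to test the first half of that hypothesis is to look at the first eight bytes of the file: OLE2 compound documents start with the signature `D0 CF 11 E0 A1 B1 1A E1`. The sketch below is only a quick sniff test under that assumption; `Ole2Sniffer` is a hypothetical name, not a POI class. It cannot detect the "blank-blank" case, since an empty wrapper would carry the same signature.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, shown for illustration only.
final class Ole2Sniffer {

    // Magic number found at the start of an OLE2 compound document.
    private static final int[] SIGNATURE = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    /** Returns true if the stream begins with the OLE2 signature. Consumes up to 8 bytes. */
    static boolean startsWithOle2Signature(InputStream in) throws IOException {
        for (int i = 0; i < SIGNATURE.length; i++) {
            int b = in.read(); // single-byte read; returns -1 at end of stream
            if (b != SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }
}
```

It could be called with a fresh `FileInputStream` on the suspect document, which should be closed afterwards and not reused for parsing, because the check consumes the header bytes.
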
>
> -Andy
>
> [EMAIL PROTECTED] wrote:
>
>>Hi all,
>>
>>I have a strange problem when I deploy my Word document extraction
>>application on AIX (Unix). I have run the application many times on Windows
>>using WSAD and I never got this problem with Word documents. All other
>>documents are read fine (PDF, Excel, Txt); only the Word documents seem to
>>jam.
>>I use the textmining library to do the extraction.
>>
>>
>>This is the error I get:
>>
>>
>>2005-11-08 16:02:21,939 ERROR [P=689750:O=0:CT] (?:?) - Error while parsing word document java.io.IOException: Illegal block count; minimum count is 1, got 0 instead
>>java.io.IOException: Illegal block count; minimum count is 1, got 0 instead
>>        at org.apache.poi.poifs.storage.BlockAllocationTableReader.<init>(BlockAllocationTableReader.java(Compiled Code))
>>        at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java(Compiled Code))
>>        at org.textmining.text.extraction.WordExtractor.extractText(WordExtractor.java(Compiled Code))
>>        at ca.ulaval.bibl.lius.index.MSWord.WordIndexer.parse(WordIndexer.java(Inlined Compiled Code))
>>        at ca.ulaval.bibl.lius.index.MSWord.WordIndexer.getPopulatedCollection(WordIndexer.java(Compiled Code))
>>        at ca.ulaval.bibl.lius.index.Indexer.createLuceneDocument(Indexer.java:87)
>>        at ca.ulaval.bibl.lius.index.MSWord.WordIndexer.createLuceneDocument(WordIndexer.java:81)
>>        at ca.ulaval.bibl.lius.index.Indexer.createLuceneDocument(Indexer.java(Compiled Code))
>>        at com.vmd.intranet.research.index.bean.IndexerRamBean.indexFile(IndexerRamBean.java(Compiled Code))
>>        at com.vmd.intranet.research.index.bean.IndexerRamBean.indexFolder(IndexerRamBean.java(Compiled Code))
>>        at com.vmd.intranet.research.index.bean.IndexerRamBean.indexFolder(IndexerRamBean.java:153)
>>        at com.vmd.intranet.research.index.bean.IndexerRamBean.processIndexing(IndexerRamBean.java:137)
>>        at com.vmd.intranet.research.index.IndexFilesLauncher.processIndexing(IndexFilesLauncher.java:123)
>>        at com.vmd.intranet.research.index.IndexFilesLauncher.main(IndexFilesLauncher.java:60)
>>        at java.lang.reflect.Method.invoke(Native Method)
>>        at com.ibm.websphere.client.applicationclient.launchClient.createContainerAndLaunchApp(launchClient.java:448)
>>        at com.ibm.websphere.client.applicationclient.launchClient.main(launchClient.java:304)
>>        at java.lang.reflect.Method.invoke(Native Method)
>>        at com.ibm.ws.bootstrap.WSLauncher.main(WSLauncher.java:158)
>>
>>
>>And this is the code I use. Is there a trivial mistake I made??
>>
>>
>>  public Object parse(Object file) {
>>            File wordDoc = new File((String) file);
>>
>>            WordExtractor we = new WordExtractor();
>>            String fullText = "";
>>            try {
>>                  FileInputStream fStream = new FileInputStream(wordDoc);
>>                  fullText = we.extractText(fStream); // <---- ERROR HERE
>>            } catch (FileNotFoundException e1) {
>>                  logger.error("FileNotFound while parsing word document " + e1);
>>                  e1.printStackTrace();
>>            } catch (Exception e) {
>>                  logger.error("Error while parsing word document " + e);
>>                  e.printStackTrace();
>>            }
>>            return fullText;
>>      }
>>
>>
>>
>>Thanks for answering!
>>
>>Please put my email address in cc!
>>[EMAIL PROTECTED]
>>
>>Etienne
>>Montreal
>>
>>- The integrity of the transmitted information in this E-mail is not
>>guaranteed by Desjardins Securities which accepts no liability for any
>>damage caused by its fraudulent alteration.  - This E-mail is confidential
>>and is intended for the sole use of the recipient or authorized
>>representative of the recipient. Any person who receives this E-mail by
>>mistake shall immediately notify the sender and destroy it. Any other use
>>of the information therein is strictly prohibited. - In no manner does this
>>notice limit other more restrictive warnings which may have been
>>transmitted to you by Desjardins Securities.
>
>
>
> --
> Andrew C. Oliver
> SuperLink Software, Inc.
>
> Java to Excel using POI
> http://www.superlinksoftware.com/services/poi
> Commercial support including features added/implemented, bugs fixed.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> Mailing List: http://jakarta.apache.org/site/mail2.html#poi
> The Apache Jakarta Poi Project: http://jakarta.apache.org/poi/
>
>
>


--
Andrew C. Oliver
SuperLink Software, Inc.




