Re: Python word to text
2009/9/1 Tino Wildenhain:
> On 01.09.2009 13:42, Nitebirdz wrote:
>> On Tue, Sep 01, 2009 at 11:38:30AM +0200, BJörn Lindqvist wrote:
>>> Hello everybody,
>>>
>>> I'm looking for a pure Python solution for converting Word documents
>>> to text. App Engine doesn't allow external programs, which means that
>>> external programs like catdoc and antiword can't be used. Anyone know
>>> of any?
>>
>> A quick search returned this:
>>
>> http://code.activestate.com/recipes/279003/
>>
>> Did you give it a try?
>
> That's funny advice. Did you read that recipe? ;-)
> "Requires the Python for Windows extensions, and MS Word."
> How does this match with "App Engine doesn't allow external programs"? :-)
>
> For Excel this would be easy, but Word - Björn, did you check the Google API
> to see if you would be able to access Google Docs for this?

I did not, thanks for the tip! The system I managed to hack together
uploads the .doc to a Google Docs account and then retrieves it again
as plain text. It works but sure feels kind of silly. It's not very
reliable, because if Google has some kind of problem with its Docs
application, it doesn't work at all. Plus the method is dirt slow due
to the latency of all the HTTP calls. But it's better than nothing.

--
Kind regards,
Björn
--
http://mail.python.org/mailman/listinfo/python-list
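The reliability complaint above (the conversion fails whenever the remote service has a hiccup) could be softened by retrying the round trip a few times with an increasing pause. The following is only a sketch of that idea, not what was actually deployed: `call_with_retries` and its parameters are hypothetical names, and the upload/download step itself is left as a callable that you would supply.

```python
import time


def call_with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds, retrying on any Exception.

    Waits base_delay, then 2 * base_delay, and so on between attempts,
    and re-raises the last error once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the failure
            sleep(base_delay * 2 ** attempt)


# Usage (convert_doc is a hypothetical function wrapping the
# upload-then-download round trip; it is not defined here):
#   text = call_with_retries(lambda: convert_doc(doc_bytes))
```

Retrying does nothing for the latency problem; it only hides short outages, and it makes the worst case slower.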
Re: Python word to text
On Tue, Sep 01, 2009 at 03:20:29PM +0200, Tino Wildenhain wrote:
>> A quick search returned this:
>>
>> http://code.activestate.com/recipes/279003/
>>
>> Did you give it a try?
>
> That's funny advice. Did you read that recipe? ;-)
> "Requires the Python for Windows extensions, and MS Word."
> How does this match with "App Engine doesn't allow external programs"? :-)

Sorry, you're absolutely right. I did notice it required Windows, but
didn't see anything in the original message saying that this wasn't to
be run on Windows. As for the issue regarding external programs, I
assumed it only referred to the ones explicitly mentioned or similar
(catdoc, antiword, etc.). My apologies.
Re: Python word to text
2009/9/1 BJörn Lindqvist:
> Hello everybody,
>
> I'm looking for a pure Python solution for converting Word documents
> to text. App Engine doesn't allow external programs, which means that
> external programs like catdoc and antiword can't be used. Anyone know
> of any?

You could use the Google Docs API
(http://code.google.com/apis/documents/docs/3.0/developers_guide_protocol.html#DownloadingDocsAndPresentations)

--
Kind Regards
Re: Python word to text
BJörn Lindqvist wrote:
> 2009/9/1 Nitebirdz:
>> On Tue, Sep 01, 2009 at 11:38:30AM +0200, BJörn Lindqvist wrote:
>>> Hello everybody,
>>>
>>> I'm looking for a pure Python solution for converting Word documents
>>> to text. App Engine doesn't allow external programs, which means that
>>> external programs like catdoc and antiword can't be used. Anyone know
>>> of any?
>>
>> A quick search returned this:
>>
>> http://code.activestate.com/recipes/279003/
>
> It requires Windows.

I'm moderately confident that no (published) solution exists for this
without relying on an installed Word or an external program of the kind
you mentioned. Obviously, there's nothing to stop someone creating a
Python module which does the equivalent, possibly by wrapping the core
of the catdoc/antiword code in a Python module or by recoding its
functionality in Python. But I imagine you knew that :)

If you were talking Excel, you'd be in luck thanks to the sterling work
done by John Machin and others. But I imagine that the market for Word
doc interchange / conversion is considerably smaller, especially within
restricted environments.

Depending on the source of your docs, it would be possible to save them
as, e.g., XML or something else for which a converter is available in
Python. Even text-only, I suppose. But I suppose that you're asking
because that's not a possibility?

TJG
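To illustrate the "save them as XML" route suggested above: if the documents can be saved as .docx (an assumption, since the thread only talks about .doc files), the format is a zip archive containing XML, and the standard library alone can pull the text out. This is a minimal sketch under that assumption; `docx_to_text` is a hypothetical helper name, and it does not help with the binary .doc format.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part of a .docx.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(data):
    """Return the plain text of a .docx file, given its raw bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; each run holds w:t nodes.
        pieces = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

The sketch ignores headers, footers, footnotes, tabs and line breaks, and paragraphs nested inside other paragraphs (text boxes, for example) would have their text repeated, so treat it as a starting point only.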
Re: Python word to text
On 01.09.2009 13:42, Nitebirdz wrote:
> On Tue, Sep 01, 2009 at 11:38:30AM +0200, BJörn Lindqvist wrote:
>> Hello everybody,
>>
>> I'm looking for a pure Python solution for converting Word documents
>> to text. App Engine doesn't allow external programs, which means that
>> external programs like catdoc and antiword can't be used. Anyone know
>> of any?
>
> A quick search returned this:
>
> http://code.activestate.com/recipes/279003/
>
> Did you give it a try?

That's funny advice. Did you read that recipe? ;-)
"Requires the Python for Windows extensions, and MS Word."
How does this match with "App Engine doesn't allow external programs"? :-)

For Excel this would be easy, but Word - Björn, did you check the Google
API to see if you would be able to access Google Docs for this?

Regards
Tino
Re: Python word to text
2009/9/1 Nitebirdz:
> On Tue, Sep 01, 2009 at 11:38:30AM +0200, BJörn Lindqvist wrote:
>> Hello everybody,
>>
>> I'm looking for a pure Python solution for converting Word documents
>> to text. App Engine doesn't allow external programs, which means that
>> external programs like catdoc and antiword can't be used. Anyone know
>> of any?
>
> A quick search returned this:
>
> http://code.activestate.com/recipes/279003/

It requires Windows.

--
Kind regards,
Björn
Re: Python word to text
On Tue, Sep 01, 2009 at 11:38:30AM +0200, BJörn Lindqvist wrote:
> Hello everybody,
>
> I'm looking for a pure Python solution for converting Word documents
> to text. App Engine doesn't allow external programs, which means that
> external programs like catdoc and antiword can't be used. Anyone know
> of any?

A quick search returned this:

http://code.activestate.com/recipes/279003/

Did you give it a try?
Python word to text
Hello everybody,

I'm looking for a pure Python solution for converting Word documents
to text. App Engine doesn't allow external programs, which means that
external programs like catdoc and antiword can't be used. Anyone know
of any?

Thanks in advance.

--
Kind regards,
Björn