Re: Python word to text

2009-09-01 Thread BJörn Lindqvist
2009/9/1 Tino Wildenhain :
> On 01.09.2009 13:42, Nitebirdz wrote:
>>
>> On Tue, Sep 01, 2009 at 11:38:30AM +0200, BJörn Lindqvist wrote:
>>>
>>> Hello everybody,
>>>
>>> I'm looking for a pure Python solution for converting word documents
>>> to text. App Engine doesn't allow external programs, which means that
>>> external programs like catdoc and antiword can't be used. Anyone know
>>> of any?
>>>
>>
>> A quick search returned this:
>>
>> http://code.activestate.com/recipes/279003/
>>
>>
>> Did you give it a try?
>
> That's funny advice. Did you read that recipe? ;-)
> "Requires the Python for Windows extensions, and MS Word."
> How does this match with "App Engine doesn't allow external programs"? :-)
>
> For Excel this would be easy, but Word is another matter. Björn, did you
> check the Google API to see whether you could access Google Docs for this?

I did not, thanks for the tip! The system I managed to hack together
uploads the .doc to a Google Docs account and then retrieves it again
as plain text. It works, but it sure feels kind of silly. It's not very
reliable: if Google has any problem with the Docs application, it
doesn't work at all. Plus, the method is dirt slow due to the latency
of all the HTTP calls. But it's better than nothing.
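
The reliability problem described above might be softened by retrying
the round trip a few times before giving up. The following is only a
minimal sketch: `convert` stands in for whatever function performs the
upload and download (it is hypothetical and not part of any Google
library), and the attempt count and delays are arbitrary assumptions.
It does nothing about the latency itself.

```python
import time


def with_retries(convert, data, attempts=3, delay=1.0, sleep=time.sleep):
    """Call convert(data), retrying with exponential backoff on failure.

    `convert` is a hypothetical callable that uploads the document and
    returns its plain text; any exception it raises counts as a failure.
    """
    for attempt in range(attempts):
        try:
            return convert(data)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            # Wait 1x, 2x, 4x, ... the base delay before trying again.
            sleep(delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper easy to test without
actually waiting.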


-- 
mvh Björn
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python word to text

2009-09-01 Thread Nitebirdz
On Tue, Sep 01, 2009 at 03:20:29PM +0200, Tino Wildenhain wrote:
>>
>> A quick search returned this:
>>
>> http://code.activestate.com/recipes/279003/
>>
>>
>> Did you give it a try?
>
> That's funny advice. Did you read that recipe? ;-)
> "Requires the Python for Windows extensions, and MS Word."
> How does this match with "App Engine doesn't allow external programs"? :-)
>

Sorry, you're absolutely right. I did notice it required Windows, but
didn't see any comments in the original message that this wasn't to be
run on Windows. As for the issue regarding external programs, I assumed
it only referred to the ones explicitly mentioned or similar (catdoc,
antiword, etc.).

My apologies.



Re: Python word to text

2009-09-01 Thread Gabriel
2009/9/1 BJörn Lindqvist :
> Hello everybody,
>
> I'm looking for a pure Python solution for converting word documents
> to text. App Engine doesn't allow external programs, which means that
> external programs like catdoc and antiword can't be used. Anyone know
> of any?
>

You could use the Google Docs API:
http://code.google.com/apis/documents/docs/3.0/developers_guide_protocol.html#DownloadingDocsAndPresentations

-- 
Kind Regards


Re: Python word to text

2009-09-01 Thread Tim Golden

BJörn Lindqvist wrote:
> 2009/9/1 Nitebirdz :
>> On Tue, Sep 01, 2009 at 11:38:30AM +0200, BJörn Lindqvist wrote:
>>> Hello everybody,
>>>
>>> I'm looking for a pure Python solution for converting word documents
>>> to text. App Engine doesn't allow external programs, which means that
>>> external programs like catdoc and antiword can't be used. Anyone know
>>> of any?
>>
>> A quick search returned this:
>>
>> http://code.activestate.com/recipes/279003/
>
> It requires Windows.


I'm moderately confident that no (published) solution exists
for this without relying on an installed Word or an external
program of the kind you mentioned. Obviously, there's nothing
to stop someone creating a Python module which does the
equivalent, possibly by wrapping the core of the catdoc/antiword
code in a Python module or by recoding its functionality in
Python. But I imagine you knew that :)
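
To illustrate what a very crude pure-Python fallback could look like,
here is a sketch that scans the raw bytes of a .doc file for runs of
printable characters, much as the Unix `strings` tool does. This is
NOT how catdoc or antiword work and it does not parse the Word file
format. It assumes the text is stored either one byte per character
or as UTF-16LE, handles ASCII only, and will also pick up metadata
such as font names. The function name and the minimum run length are
illustrative choices.

```python
import re

MIN_RUN = 4  # shortest run treated as text; an arbitrary threshold

# Runs of printable ASCII stored one byte per character.
ASCII_RUN = re.compile(rb"[\x20-\x7e\t\r\n]{%d,}" % MIN_RUN)
# Runs of printable ASCII stored as UTF-16LE (character byte, then NUL).
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e\t\r\n]\x00){%d,}" % MIN_RUN)


def guess_text(data):
    """Return text-like runs found in raw bytes, in file order."""
    found = []
    for match in UTF16_RUN.finditer(data):
        found.append((match.start(), match.group().decode("utf-16-le")))
    for match in ASCII_RUN.finditer(data):
        found.append((match.start(), match.group().decode("ascii")))
    found.sort()  # restore the order in which the runs appear in the file
    # Carriage returns are turned into newlines; blank runs are dropped.
    return [text.replace("\r", "\n").strip()
            for _, text in found if text.strip()]
```

The output needs manual cleanup and loses all non-ASCII characters, so
this is a last resort rather than a replacement for a real converter.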

If you were talking Excel, you'd be in luck thanks to the
sterling work done by John Machin and others. But I imagine
that the market for Word document interchange / conversion is
considerably smaller, especially within restricted environments.

Depending on the source of your docs, it would be possible to
save them as, e.g., XML or something for which a converter is
available in Python. Even text-only, I suppose. But I suppose
that you're asking because that's not a possibility?
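
As one example of the "save as XML" route: if the documents can be
saved in the newer .docx format (an assumption; the original question
is about .doc files), the text can be pulled out with the standard
library alone, since a .docx file is a zip archive whose main body
lives in word/document.xml. A minimal sketch, with `docx_to_text` as
an illustrative name, ignoring headers, footers, footnotes and any
layout:

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml in .docx files.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(data):
    """Return the plain text of a .docx file given its raw bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Each w:t element holds one run of text within the paragraph.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

Working on bytes rather than a file path suits an upload handler, which
typically receives the document in memory.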

TJG


Re: Python word to text

2009-09-01 Thread Tino Wildenhain

On 01.09.2009 13:42, Nitebirdz wrote:
> On Tue, Sep 01, 2009 at 11:38:30AM +0200, BJörn Lindqvist wrote:
>> Hello everybody,
>>
>> I'm looking for a pure Python solution for converting word documents
>> to text. App Engine doesn't allow external programs, which means that
>> external programs like catdoc and antiword can't be used. Anyone know
>> of any?
>
> A quick search returned this:
>
> http://code.activestate.com/recipes/279003/
>
> Did you give it a try?

That's funny advice. Did you read that recipe? ;-)
"Requires the Python for Windows extensions, and MS Word."
How does this match with "App Engine doesn't allow external programs"? :-)

For Excel this would be easy, but Word is another matter. Björn, did you
check the Google API to see whether you could access Google Docs for this?

Regards
Tino





Re: Python word to text

2009-09-01 Thread BJörn Lindqvist
2009/9/1 Nitebirdz :
> On Tue, Sep 01, 2009 at 11:38:30AM +0200, BJörn Lindqvist wrote:
>> Hello everybody,
>>
>> I'm looking for a pure Python solution for converting word documents
>> to text. App Engine doesn't allow external programs, which means that
>> external programs like catdoc and antiword can't be used. Anyone know
>> of any?
>>
>
> A quick search returned this:
>
> http://code.activestate.com/recipes/279003/

It requires Windows.


-- 
mvh Björn


Re: Python word to text

2009-09-01 Thread Nitebirdz
On Tue, Sep 01, 2009 at 11:38:30AM +0200, BJörn Lindqvist wrote:
> Hello everybody,
> 
> I'm looking for a pure Python solution for converting word documents
> to text. App Engine doesn't allow external programs, which means that
> external programs like catdoc and antiword can't be used. Anyone know
> of any?
> 

A quick search returned this:

http://code.activestate.com/recipes/279003/


Did you give it a try?  




Python word to text

2009-09-01 Thread BJörn Lindqvist
Hello everybody,

I'm looking for a pure Python solution for converting word documents
to text. App Engine doesn't allow external programs, which means that
external programs like catdoc and antiword can't be used. Anyone know
of any?

Thanks in advance.


-- 
mvh Björn