We have used Automation with MS Word successfully in a server environment.
However, we had to make a few design decisions:

1) Because Word is written as a UI application, you must be prepared for
message boxes to pop up and block your application. We wrote code to
auto-answer some of those boxes, and cancelled the conversion if we didn't
know what to answer.

2) If you load many documents in a loop, you will eventually end up with a
crashed application.
We let Windows do the cleanup by writing a helper application that does
the Word automation. The server app then starts this helper app once for
each document that must be converted.
The server can set a timeout for the conversion and cancel the helper app
if it takes too long. The helper app itself can listen for a Win32 event
that tells it to kill itself.
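
The helper-process idea in point 2 could be sketched roughly as follows.
This is a minimal Python sketch under stated assumptions, not the code
described above: `convert_with_timeout`, the helper command line, and the
timeout values are hypothetical names and values chosen for illustration.
The only point it shows is "one process per document, killed by the
server when it runs too long".

```python
import subprocess
import sys


def convert_with_timeout(helper_cmd, document_path, timeout_seconds=60.0):
    """Run one helper process for one document.

    Returns True if the helper exited with code 0 within the timeout,
    False if it failed or had to be killed.
    """
    try:
        # subprocess.run kills the child and waits for it when the
        # timeout expires, so the operating system reclaims everything
        # the helper had allocated.
        result = subprocess.run(
            [*helper_cmd, document_path],
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Stand-in helper (hypothetical): a real one would drive Word here.
    helper = [sys.executable, "-c",
              "import sys; print('converting', sys.argv[1])"]
    ok = convert_with_timeout(helper, "resume.doc", timeout_seconds=10.0)
    print("converted" if ok else "failed or timed out")
```

Because every document gets a fresh process, a hung or crashed helper
costs one conversion at most and cannot take the server down with it.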

Good luck,
Peter v/d Weerd


----- Original Message -----
From: "Steve Welborn" <[EMAIL PROTECTED]>
To: <ADVANCED-DOTNET@DISCUSS.DEVELOP.COM>
Sent: Monday, December 11, 2006 2:55 PM
Subject: Re: [ADVANCED-DOTNET] Does anyone know how to read a Word document
in .Net 2003?


You could try the Automation/Server idea. MS makes it easy to use, but
like most here I've had nothing but nightmares with it. Automation with
Word is a memory hog: most of the time the instances remain in memory
despite whatever measures you take to close them, not to mention the
crashes that have occurred or could occur.

But from what you described of your intended use, I would probably go with
a Service as well. I would just be sure to double-check that Word is out
of memory when you are done.


Good luck.
Steve

-----Original Message-----
From: Discussion of advanced .NET topics.
[mailto:[EMAIL PROTECTED] On Behalf Of Marc Brooks
Sent: Monday, December 11, 2006 12:22 AM
To: ADVANCED-DOTNET@DISCUSS.DEVELOP.COM
Subject: Re: [ADVANCED-DOTNET] Does anyone know how to read a Word document
in .Net 2003?

On 12/10/06, Jon Rothlander <[EMAIL PROTECTED]> wrote:
I think that is what I want to do.  I just want something that will
convert it to text.  I was just thinking that if in a .Net app you can
easily open the Word doc and then save it back out as a Text file...

Having been there, done that, and regretted it, let me share.  I worked
on a project[1] that used to extract resumes in Word/Word Perfect/etc.
documents via automation so we could pass them through an expert system
to extract the information. The WinWord process constantly crashed and
locked the service.

We then tried several commercial conversion tools (including several
that were supposed to be usable for batch conversion or server-based
setups), but none of them worked.

Then I hit on the radical idea that "if it's good enough for
index-server[2], it's good enough for me" and used the installed IFilter
drivers to suck out the text of any file for which we had an IFilter
driver (and dude, are there tons of them available for free). I wrote a
little COM component in C++ that simply defers to the shell to load the
correct driver, ignores all the "formatting" information, and keeps the
text, which is returned as a BSTR. Optionally, you can ask it to "clean
the text", which normalizes the Unicode encodings and morphs digit-like
characters into actual digits.
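
The optional "clean the text" step might look roughly like the Python
sketch below. The actual component is written in C++ and its source is
not shown in this thread, so everything here is an assumption about what
such a cleanup could do: `clean_text` is a hypothetical name, NFKC is an
assumed choice of normalization form, and the digit mapping uses Python's
`unicodedata.digit` rather than whatever the original code uses.

```python
import unicodedata


def clean_text(text: str) -> str:
    """Normalize Unicode and turn digit-like characters into ASCII digits."""
    # NFKC folds compatibility forms (full-width letters and digits,
    # ligatures, circled numbers, ...) into their plain equivalents.
    normalized = unicodedata.normalize("NFKC", text)

    cleaned = []
    for ch in normalized:
        # unicodedata.digit returns the digit value for characters that
        # have one (for example Arabic-Indic digits), else the default.
        value = unicodedata.digit(ch, None)
        cleaned.append(str(value) if value is not None else ch)
    return "".join(cleaned)
```

For instance, full-width or Arabic-Indic digits in a phone number would
come out as plain ASCII digits, which is easier for a downstream parser
to handle.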

If you are interested, I can post the source for this... it is still in
service to this day and it really works well.

[1] http://www.sendouts.com
[2] http://msdn.microsoft.com/library/default.asp?url=/library/en-us/indexsrv/hh/indexsrv/ixufilt_94fm.asp

IFilters:
http://www.adobe.com/support/downloads/8122.htm
http://www.corel.com/support/ftpsite/pub/wordperfect/wpwin/8/cwps8.htm#
http://www.adobe.com/support/downloads/8126.htm
http://www.cad-company.nl/ifilter/
http://www.microsoft.com/sharepoint/techinfo/reskit/RTF_Filter.asp
http://www.microsoft.com/sharepoint/techinfo/reskit/XML_Filter.asp
http://www.naa.gov.au/Search/srchadm/help/default.htm#Top
http://www.mp3machine.com/software/MP3_Ifilter/

--
"I am Dyslexic of Borg. Resistors are fertile. Prepare to have your ass
laminated." -- Dan Nitschke

Marc C. Brooks
http://musingmarc.blogspot.com

===================================
This list is hosted by DevelopMentor®
http://www.develop.com

View archives and manage your subscription(s) at
http://discuss.develop.com



