Re: [ADVANCED-DOTNET] Does anyone know how to read a Word document in .Net 2003?

Jon Rothlander Mon, 11 Dec 2006 20:15:31 -0800

I really appreciate all of the discussion on this topic and the many great
ideas.  I have taken each suggestion and dug into it in depth.


Are there any issues if I just rename the Word doc from file.doc to
file.txt, then open the file as a text document and parse it for the data I
need?  I know that the Word document format is not straight ASCII text, but
it appears that the data itself is.

I've opened the file and there's a lot of garbage in there from Word, but
the data I need is just sitting there as text.  I wrote a simple parser to
read the file and remove the extra characters that would cause problems,
mainly chr(0), which the stream reader seems to interpret as an end-of-file
character.  Reading it like this seems to work fine.
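
To make the cleanup step concrete, here is a rough sketch of the idea.  It
is written in Python purely for illustration (it is not the .NET parser
described above), and the function name extract_text_runs and the min_len
cutoff are made up for this example.  It assumes the text of interest is
plain ASCII, possibly interleaved with chr(0) bytes:

```python
import re
from typing import List


def extract_text_runs(raw: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII found in raw binary data.

    Hypothetical helper: the name and the min_len cutoff are assumptions
    made for this sketch, not part of any Word or .NET API.
    """
    # Drop chr(0) bytes first; text stored as 16-bit characters shows up
    # as ASCII letters with a zero byte after each one.
    cleaned = raw.replace(b"\x00", b"")
    # Keep only runs of printable ASCII (space through tilde) that are at
    # least min_len bytes long; shorter runs are treated as noise.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(cleaned)]
```

Reading the file as bytes avoids any end-of-file confusion over chr(0).
One caveat with this sketch: removing zero bytes can glue unrelated binary
data together, so some of the runs it returns will still be junk.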

I'm scanning for tags such as firstname:, lastname:, etc.  They are all
there and I really couldn't care less about all of the Word stuff in the
document.  I just need the textual data.
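
The tag lookup could be sketched along these lines, again in Python only
as an illustration.  The function name extract_fields is hypothetical, the
tag list contains just the two tags named above, and the sketch assumes
each value sits on the same line as its tag:

```python
import re
from typing import Dict, Sequence

# Only the two tags mentioned in the post; the real list is longer.
TAGS = ("firstname", "lastname")


def extract_fields(text: str, tags: Sequence[str] = TAGS) -> Dict[str, str]:
    """Map each tag found in text to the value that follows it.

    Hypothetical helper for illustration; assumes "tag: value" appears
    on a single line, terminated by a carriage return or newline.
    """
    fields = {}
    for tag in tags:
        # Match "tag:" at a word boundary, then capture up to end of line.
        pattern = r"\b%s:[ \t]*([^\r\n]*)" % re.escape(tag)
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            fields[tag] = match.group(1).strip()
    return fields
```

Tags that are missing from the text are simply left out of the result, so
the caller can tell "not found" apart from an empty value.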

Does anyone see any problems with this approach?  Just read it as a .TXT
file and pull out the data.

Jon

===================================
This list is hosted by DevelopMentor®  http://www.develop.com

View archives and manage your subscription(s) at http://discuss.develop.com