Thanks, always here to help.

On 11/27/06, Bob Gailer <[EMAIL PROTECTED]> wrote:
Bokverket wrote:
> I did program a lot in VB's earlier versions, but it has grown... My
> reason for not considering VB was that the actual processing would make
> excellent use of the Python collection objects /dictionaries/, which in
> my mind would hold words of the Microsoft Word document.

Are you aware that VBA with the scripting runtime offers a dictionary
object almost identical to Python's dict? Here is a VB Sub that counts
the words in the document:

Sub test()
    Dim d As Document, w As Variant, w2 As String, dict As Object
    Set dict = CreateObject("Scripting.Dictionary")
    Set d = Documents(1)
    For Each w In d.Range().Words
        w2 = Trim(w)
        If dict.Exists(w2) Then
            dict(w2) = dict(w2) + 1
        Else
            ' first occurrence of this word
            dict(w2) = 1
        End If
    Next
End Sub

For my test document of about 21000 occurrences of 120 words this took
about 5 seconds. The Python equivalent takes 0.15 seconds.

>>> import time
>>> import win32com.client
>>> a = win32com.client.Dispatch("word.application")
>>> d = a.Documents(1)
>>> # wrap the process to get the text from the document, split it into
>>> # words and build the dictionary
>>> def f():
...     t=time.time()
...     s=d.Range().Text
...     w=s.split()
...     wd={}
...     for i in w:
...         wd[i]=wd.setdefault(i,0)+1
...     print time.time()-t
...
>>> f()
0.15700006485

> (The app's purpose is to analyze words of possibly very large Word
> documents.) Plus I suppose that a macro which would loop with a few
> lines over each word of the doc will be slow, although I don't know if
> there is a compiling or byte-code mechanism. Am I wrong?
>
> I don't know if having a VB as glue to shelling Python is perfectly fine
> performance-wise, and it certainly would be a simple way to handle the
> dialog boxes that collect the parameters. Maybe that is a much better way
> than wondering about calling Python /shelling, calling a DLL, whatever/
> directly.
>
> Next question: Is Microsoft Word's API for Python published like for VB
> and easy to use?
>

Word has one API. It is what is published for VB.
Your Python program would use win32com.client to launch Word as a COM
server, then interact with it the same as a VB program (well, almost the
same). For this you need pywin32: http://sourceforge.net/projects/pywin32/

import win32com.client
application = win32com.client.Dispatch("word.application")
# application is the same as the Application object you see at the top
# of the Word DOM in VB.
document = application.Documents.Add()
# to create a new document, OR .Documents.Open(filename) to open an
# existing document.
# OR if Word is already running you can access an existing document
# using .Documents(indexno OR name)
# How is this different from VB? Objects do not have default properties;
# you must be explicit. There is no Set statement. Functions and subs
# must have the () appended.

Hope that's enough to get you started. Since your goal seems to be text
processing I'd think you'd want to read the entire document text into a
Python string, then manipulate that.

text = document.Range().Text

will get all the text of the document body (excludes header/footer).
Note that paragraph breaks are \r, and that table cells end in \r\x07.

--
Bob Gailer
510-978-4454
_______________________________________________
Python-win32 mailing list
Python-win32@python.org
http://mail.python.org/mailman/listinfo/python-win32
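
Following up on the note above that paragraph breaks arrive as \r and
table cells end in \r\x07: a plain split() treats \r as whitespace but
not \x07, so the cell marker can end up glued to the next word. Below is
a minimal sketch (Python 3) of one way to handle that before counting.
The helper name count_words and the sample string are made up for
illustration; the sample only stands in for what document.Range().Text
might return, and Counter is swapped in for the setdefault loop.

```python
from collections import Counter

CELL_MARK = "\x07"  # end-of-cell marker mentioned in the note above


def count_words(text):
    """Count whitespace-separated words in text taken from Range().Text."""
    # Turn the end-of-cell marker into a space so it cannot stick to a
    # neighbouring word; "\r" already counts as whitespace for split().
    cleaned = text.replace(CELL_MARK, " ")
    return Counter(cleaned.split())


# Hypothetical sample standing in for document.Range().Text
sample = "the cat\rthe dog\r\x07the end\r"
counts = count_words(sample)
print(counts.most_common(1))
```

With a live Word session this would be called as
count_words(document.Range().Text). Whether punctuation should also be
stripped from the words depends on the analysis and is left out here.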