Re: [Pharo-users] Automation of MS Office from Pharo

PBKResearch Tue, 07 Apr 2020 10:29:10 -0700

Hello Tomaž


I have been exploring today, to try to narrow down where the problem appears. 
It’s difficult to give you some code to reproduce the problem, because you 
would need to have my Outlook Data File, or an extract of it, plus MS Outlook 
to run it on. If you want that, I will try to work out how to produce a minimum 
extract to show the problem.

 

Meanwhile, I can describe what the problem seems to be. All my work has been 
done in a playground. I can set up a connection to my Outlook Data File and 
navigate through it to the folder of interest, which contains editions of a 
newsletter in German. I then inspect one of the items in the folder, which is a 
COMDispatchInstance wrapping an Outlook MailItem object. My interest is the 
HTMLBody property of the MailItem, which is, as you would expect, the complete 
HTML code to display the newsletter. If I inspect HTMLBody for most MailItems, 
it displays with no problem. In some cases, however, as soon as I click 
‘inspect it’, there is a message saying the VM has crashed, and the system 
closes with a crash dump.

 

I don’t think there is a problem with the fact that I am displaying the 
property value in an inspector. I tried instead to assign it to a variable in 
the playground. As soon as I clicked ‘do it’ the whole Pharo app closed down, 
with no message or crash dump.

 

I have tried to find some common factor in the items which will cause the 
crash, and the one thing I can find is the size of HTMLBody, or equivalently 
the size of the MailItem. The HTMLBody displays as a WideString in the 
inspector, and it works fine for an object of about 230K characters. With 
another message about 10% larger, the VM crash occurred.
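
One way to check the size hypothesis without reading HTMLBody at all is to list each item's Size property first. This is a minimal sketch, not tested against the data file in question. It assumes that the Pharo-COM messages createInstanceByName:, propertyNamed: and dispatch:withArguments: behave as in the package's examples, and the two folder names are placeholders. Size is the Outlook object model's byte count for the whole item, so it is only a proxy for the length of HTMLBody.

```smalltalk
| outlook namespace store folder items count sizes |
outlook := COMDispatchInstance createInstanceByName: 'Outlook.Application'.
namespace := outlook dispatch: 'GetNamespace' withArguments: #('MAPI').

"Placeholder names: substitute your own data file and folder."
store := (namespace propertyNamed: 'Folders')
	dispatch: 'Item' withArguments: #('My Data File').
folder := (store propertyNamed: 'Folders')
	dispatch: 'Item' withArguments: #('Newsletter').

items := folder propertyNamed: 'Items'.
count := items propertyNamed: 'Count'.

"Outlook collections are 1-based. Only Subject and Size are read here,
 so the large HTMLBody string is never transferred."
sizes := (1 to: count) collect: [ :index |
	| item |
	item := items dispatch: 'Item' withArguments: { index }.
	(item propertyNamed: 'Subject') -> (item propertyNamed: 'Size') ].
sizes inspect.
```

If the items that crash are consistently the ones with the largest Size, that would support the size explanation, though it would not by itself show where the failure happens.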

 

I have no knowledge of COM interfacing, except what I have picked up in the 
last two days, but I have been trying to find a general description of a 
problem which would give the results I have seen. If a COM object has a 
property which is a BSTRString, and if the string is a WideString with more 
than about 250K characters, is there some internal buffer overflow? – the crash 
dump has several mentions of ‘Stack Overflow’, though I can’t really follow it.

 

That is about all I can explain – sorry it’s a bit long-winded. If you want 
more information, or an extract of my Outlook data file, just let me know.

 

Thanks for your help.

 

Peter Kenny

 

 

 

From: Pharo-users <pharo-users-boun...@lists.pharo.org> On Behalf Of Tomaž Turk
Sent: 07 April 2020 16:30
To: Any question about pharo is welcome <pharo-users@lists.pharo.org>
Subject: Re: [Pharo-users] Automation of MS Office from Pharo

 

Dear Peter,

 

I'm glad that you find the package useful! Can you please explain a little bit 
more about your code, because it's not obvious from your description at which 
stage the problem occurs. Could you trace it with the debugger, and/or possibly 
post here the piece of code that fails? 

 

Thanks and best wishes,

Tomaz

 

------ Original Message ------

From: "PBKResearch" <pe...@pbkresearch.co.uk <mailto:pe...@pbkresearch.co.uk> >

To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org 
<mailto:pharo-users@lists.pharo.org> >

Sent: 6. 04. 2020 19:15:16

Subject: Re: [Pharo-users] Automation of MS Office from Pharo

 

Hello Pablo (and Tomaž)

 

You asked for feedback; after a day of experiments, I have good news and bad 
news.

 

The good news is that I have been able to navigate through my Outlook Data 
File, find my way to an e-mail of interest and examine the fields of interest. 
This means I am 90% of the way to my objective. All I want is to get the HTML 
code from the message body and pass it to XMLHTMLParser for further analysis. I 
found it not too difficult to work this out; the fact that a 
COMDispatchInstance can display all its internals in a playground page meant 
that I could work it out interactively as a playground exercise. I am still 
discovering things about the interface; I have only just found out that I can 
select an item from a list by quoting its name as an argument, rather than its 
number in the list. I am sure my clumsy code can be tidied up still.

 

I can get to the 100% point in the job by exporting the contents of the e-mail 
to disk in HTML format, using the SaveAs procedure of the MailItem object. I 
can then read it back from disk into the parser. But this seems a bit clunky. 
The content of an HTML mail item is available in the HTMLBody property of the 
MailItem, so I should be able to pass it programmatically to the parser without 
going near the disk. This is where I run into trouble.
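
The SaveAs route described above might look roughly like the following. It is a sketch under assumptions: mailItem stands for a COMDispatchInstance wrapping a MailItem, the path is hypothetical, and dispatch:withArguments: is assumed to work as in the Pharo-COM examples. The value 5 is olHTML in Outlook's OlSaveAsType enumeration.

```smalltalk
| saveAndParse |
"A block taking a MailItem wrapper and a target path (both supplied by the caller)."
saveAndParse := [ :mailItem :path |
	"Ask Outlook to write the message to disk as HTML (olHTML = 5)."
	mailItem dispatch: 'SaveAs' withArguments: { path. 5 }.
	"Read the file back and parse it. This assumes the saved file is in an
	 encoding that FileReference>>contents decodes correctly; if Outlook
	 writes a different code page, the text needs explicit decoding first."
	XMLHTMLParser parse: path asFileReference contents ].

"The direct route, which is the one that fails for large messages, would be:
 XMLHTMLParser parse: (mailItem propertyNamed: 'HTMLBody')"
```

It would be used as `saveAndParse value: mailItem value: 'C:\temp\newsletter.html'`, with the path replaced by a real location.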

 

In most of the cases I tried, this worked fine; I could display the HTML body 
text as a WideString in the playground, assign it to a variable and do whatever 
I wanted. However, for just one e-mail I tried, as soon as I selected ‘Do it 
and go’ in the playground, a message came up that the VM had crashed; the 
application closed, leaving only a crash dump. I cannot find anything unusual 
about the message that failed in this way, except that when I save it, it 
produces a larger file than any of the others (just over 1MB, against up to 
300KB for the ones that worked). Could there be a limit on the size of some 
internal buffer?

 

I thought it worthwhile to mention this now, because crashing the VM is 
generally undesirable. I can solve my problems safely using the SaveAs route, 
so it’s not a problem for me.

 

Hope this helps; if you want more details, let me know. Overall I am very happy 
with this library.

 

Peter Kenny

 

From: Pharo-users <pharo-users-boun...@lists.pharo.org 
<mailto:pharo-users-boun...@lists.pharo.org> > On Behalf Of teso...@gmail.com 
<mailto:teso...@gmail.com> 
Sent: 06 April 2020 11:24
To: Any question about pharo is welcome <pharo-users@lists.pharo.org 
<mailto:pharo-users@lists.pharo.org> >
Subject: Re: [Pharo-users] Automation of MS Office from Pharo

 

Hi Peter,

    First, thanks for trying Pharo-COM; that is great. I love to have users 
for it and to hear that it is useful. 

Secondly, the Pharo 7 problem is an error I introduced. It is clear that 
some changes I made to support the new version of UFFI (the framework in 
Pharo to handle FFI calls) have broken the Pharo 7 version, so I will fix it to 
maintain compatibility.

 

It is great that you were able to make it work!

 

Cheers,

On Mon, Apr 6, 2020 at 12:18 PM PBKResearch <pe...@pbkresearch.co.uk 
<mailto:pe...@pbkresearch.co.uk> > wrote:

Hello Tomaž

 

Many thanks for your patient explanation. I should have been able to work out 
for myself how the Word test works, and indeed I realized some of it when I 
came to shut down the Word instance and its two documents. But it was late at 
night, and I should have packed up before then.

 

My last test last night showed that basically I have cracked it for my job of 
automating Outlook. I have been able to connect to my running instance of 
Outlook, open my application and interrogate the names of my top-level folders. 
From now on it should be just a matter of understanding the MS documentation of 
the Outlook model.

 

However, all this is with Pharo-COM installed in a new Pharo 8 image. I have no 
idea what went wrong with my first effort on Pharo 7. But I shan’t worry about 
that – I shall gradually move all my bits and pieces to P8. I shall try to work 
it out myself from here, but I shall come back if I get stuck.

 

Thanks again

 

Peter Kenny

 

From: Pharo-users <pharo-users-boun...@lists.pharo.org 
<mailto:pharo-users-boun...@lists.pharo.org> > On Behalf Of Tomaž Turk
Sent: 06 April 2020 08:04
To: Any question about pharo is welcome <pharo-users@lists.pharo.org 
<mailto:pharo-users@lists.pharo.org> >
Subject: Re: [Pharo-users] Automation of MS Office from Pharo

 

Hello Peter,

 

If you look at the code in the Word test you will notice that the test firstly

- creates a new Word instance, 

- makes it visible to the end user, 

- then adds a new document to the documents collection, containing the text 
"Hello from Pharo!"

- then it tests whether it can receive the same text back from Word.

 

After that, the test

- adds a second new document to the documents collection, containing the text 
"Hello from Pharo! Some additional text. ", this time passed as an array of two texts

- it activates this second document (this imitates the end user's window 
activation on the desktop)

- then it tests whether it can receive the same text back from Word.
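
The first group of steps can be sketched as follows. This is not the package's actual test code. It is an illustration that assumes the Pharo-COM messages createInstanceByName:, propertyNamed:, propertyNamed:put:, dispatch: and dispatch:withArguments:, and uses the Word object model members Documents.Add, Selection.TypeText and Document.Content.Text.

```smalltalk
| word documents selection document text |
"Create a new Word instance and make it visible to the end user."
word := COMDispatchInstance createInstanceByName: 'Word.Application'.
word propertyNamed: 'Visible' put: true.

"Add a document and type the text into it."
documents := word propertyNamed: 'Documents'.
documents dispatch: 'Add'.
selection := word propertyNamed: 'Selection'.
selection dispatch: 'TypeText' withArguments: #('Hello from Pharo!').

"Read the text back from Word."
document := word propertyNamed: 'ActiveDocument'.
text := (document propertyNamed: 'Content') propertyNamed: 'Text'.

"Word appends a paragraph mark to the content, so compare the prefix
 rather than testing for equality."
text beginsWith: 'Hello from Pharo!'
```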

 

If you look at the Task Manager, you'll notice that you have one Word process 
with two open documents.

Namely, for each document Word creates a new, separate window - the documents 
are not displayed in one "Word application window", but separately - that's a 
normal behavior for some versions of MS Office, and it happens also if you open 
several documents directly in Word. So, there is just one Word instance.

 

'finalize' clears the references to the Word instance; it doesn't close the 
program by itself. If you want to do that, you can send the Quit message to Word 
before you destroy the reference 
(https://docs.microsoft.com/en-us/office/vba/api/word.application.quit(method)).
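
In code, the order of the two steps might look like this. It is a minimal sketch, assuming that dispatch: sends an argument-free method call through Pharo-COM as in the package's examples.

```smalltalk
| word |
word := COMDispatchInstance createInstanceByName: 'Word.Application'.

"... work with the instance here ..."

"Ask Word itself to exit. If documents have unsaved changes,
 Word may prompt the user before closing."
word dispatch: 'Quit'.

"Then release the reference held on the Pharo side."
word finalize.
word := nil.
```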
 

 

Outlook behaves similarly; here's one example: 
https://www.excelcommand.com/excel-help/excel-how-to.php?i=124116

 

Calls between COM objects are asynchronous, and it's usually wise to wrap them 
in error-handling structures.
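
A minimal sketch of such a wrapper uses the standard Pharo on:do: exception handler around a single COM call. The Pharo-COM messages are assumed to behave as in the package's examples, and Version is a property of the Outlook Application object. A handler like this only catches errors signalled inside the image; it cannot intercept a failure that brings down the VM itself.

```smalltalk
| outlook version |
outlook := COMDispatchInstance createInstanceByName: 'Outlook.Application'.

"Guard the COM call: on failure, log the error and answer nil
 instead of opening a debugger."
version := [ outlook propertyNamed: 'Version' ]
	on: Error
	do: [ :error |
		Transcript show: 'COM call failed: ' , error description; cr.
		nil ].

"Release the reference whether or not the call succeeded."
outlook finalize.
```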

 

Please tell us how it goes.

 

Best wishes,

Tomaz

------ Original Message ------

From: "PBKResearch" <pe...@pbkresearch.co.uk <mailto:pe...@pbkresearch.co.uk> >

To: "Any question about pharo is welcome" <pharo-users@lists.pharo.org 
<mailto:pharo-users@lists.pharo.org> >

Sent: 5.4.2020 23:18:02

Subject: Re: [Pharo-users] Automation of MS Office from Pharo

 

Pablo - a final update before I close for the night. The Word test on the Pharo 
8 version comes up green. The strange error message is nowhere to be seen in 
any Pharo 8 runs. The result is not what I expected; I finish up with two Word 
documents open, one with the first message, the other with the two messages. I 
thought the 'finalize' command would close it down.

 

Anyway, it looks as if I need to switch to P8 to use Pharo-COM. I shall 
continue testing tomorrow on P8.

 

Sorry for the late-night hassle.

 

Peter

-- 

Pablo Tesone.
teso...@gmail.com <mailto:teso...@gmail.com>
