At 12:57:50 +0100, Sat, 17 Mar 2001 Bart Lateur wrote:

>On Sat, 17 Mar 2001 11:17:10 +0000, Alan Fry wrote:
>>So we really need to find a better way of isolating 'TEXT', 'styl'
>>(and others like 'PICT' for example) from the string returned by
>>LMGetScrapHandle. It must be set out somewhere in IM: the question is
>>where.
>
>I'm too tired and too busy right now to go hunting for it, but I'm
>pretty sure that somewhere in that data, there must be length fields,
>that indicate how long the individual subsections are. Regexes shouldn't
>even begin to come into play.

No, you have misunderstood the nature of the problem.

In the clipboard the characters 'styl' precede a 'long' integer which 
contains the number of bytes to follow to make up the 'styl' record. 
'TEXT' and 'PICT' records are treated similarly.
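
As a rough sketch of that layout (Python is used purely for
illustration; the name read_record, the sample data and the choice of
big-endian byte order are assumptions, not something taken from IM),
reading one such record at an already known offset might look like
this:

```python
import struct

def read_record(buf, offset):
    """Read one record at `offset`: 4-byte type, 4-byte length, then data."""
    tag = buf[offset:offset + 4]
    # Assumption: the 'long' is a big-endian signed 32-bit integer.
    (length,) = struct.unpack_from(">l", buf, offset + 4)
    start = offset + 8
    return tag, buf[start:start + length]

# Made-up sample: a 'TEXT' record holding five bytes.
sample = b"TEXT" + struct.pack(">l", 5) + b"hello"
tag, data = read_record(sample, 0)
```

This only helps once the offset of the four-character type is known,
which is exactly the open question.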

The first two bytes of a 'styl' record comprise a 'short' which 
specifies the number of individual styles to be found in the record.

Each "style" comprises a block of 20 bytes which can be unpacked into 
ten items of information e.g. Font ID, Font size, Font style and so 
forth.
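
A minimal sketch of that unpacking follows. The split of the 20 bytes
into these particular ten fields (start offset, line height, ascent,
font ID, style byte, filler byte, font size and three colour
components) is an assumption recalled from the TextEdit style scrap
layout and should be checked against IM; the names STYLE_FORMAT and
unpack_styl are invented for the example.

```python
import struct

# Assumed layout of one 20-byte style block, big-endian:
# long start, short height, short ascent, short font ID,
# byte style, byte filler, short size, three unsigned shorts (colour).
STYLE_FORMAT = ">lhhhBBhHHH"
STYLE_SIZE = struct.calcsize(STYLE_FORMAT)  # 20 bytes

def unpack_styl(record):
    """Return a list of ten-item tuples, one per style in the record."""
    # The first two bytes hold the number of styles.
    (count,) = struct.unpack_from(">h", record, 0)
    return [
        struct.unpack_from(STYLE_FORMAT, record, 2 + i * STYLE_SIZE)
        for i in range(count)
    ]

# Made-up record containing a single style.
record = struct.pack(">h", 1) + struct.pack(
    STYLE_FORMAT, 0, 12, 9, 4, 1, 0, 9, 0, 0, 0
)
styles = unpack_styl(record)
```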

That much is known. However, the 'chunk' in question, that is, the 
characters 'styl' followed by the prescribed number of bytes, is 
buried in a mass of other stuff.

In the case of TE-based applications the situation is fairly simple, 
but others such as 'WASTE'-based applications (TexEdit, Shuck etc.) 
add their own idiosyncratic records as well. Microsoft 'Word' in 
particular dumps a huge quantity of stuff onto the clipboard merely 
to say "hello" in plain unadorned letters.

So the problem devolves to one of isolating the sub-strings 'styl' 
and 'TEXT' (and possibly also 'PICT') plus their appropriate bytes 
from all the mush in which they are embedded in no consistent order.
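
For illustration only, here is one naive way the isolation step could
be sketched with a plain substring search instead of a regex. The
function name find_chunk and the sample clipboard bytes are invented,
and the approach assumes the four characters of the type do not also
turn up by chance inside some other record's data, which nothing
above guarantees.

```python
import struct

def find_chunk(buf, tag):
    """Return the bytes following the first `tag` and its length field."""
    pos = buf.find(tag)
    # Give up if the tag is absent or there is no room for a length field.
    if pos < 0 or pos + 8 > len(buf):
        return None
    # Assumption: the length is a big-endian signed 32-bit integer.
    (length,) = struct.unpack_from(">l", buf, pos + 4)
    start = pos + 8
    return buf[start:start + length]

# Made-up clipboard contents with unrelated bytes around the records.
clip = (
    b"\x00junk"
    + b"styl" + struct.pack(">l", 2) + b"\x00\x00"
    + b"more"
    + b"TEXT" + struct.pack(">l", 2) + b"hi"
)
text = find_chunk(clip, b"TEXT")
styl = find_chunk(clip, b"styl")
pict = find_chunk(clip, b"PICT")
```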

If Bart, when he is not so tired and busy ;), can suggest a way of 
doing this without recourse to 'regexes', we would all be grateful.

Alan Fry
