Hi William, On Saturday, 2008-11-08 23:48:43 +0000, William S Fulton wrote:
> I've had a good look at many of these and have posted a new patch fixing
> various multiline problems.
This is great! I think that patch does it, other opinions?
> It includes some
> subtle changes which I hope are okay as they bring these non-Calc
> formats in line with other spreadsheet programs, I've been looking
> closely at Excel, Gnumeric and Quattro Pro. Excel is definitely the most
> polished and I've mostly based compatibility on this program. Of
> particular note is the unformatted text and SYLK quoting convention
> change.
Bringing SYLK closer to what Excel reads/writes is always good. I'm not
sure about the unformatted text, is the quoting convention now what
Excel writes to the clipboard if requested?
> The biggest area for change though is DDE links and I need some help
> here before implementing them. Firstly, tabs within a cell are broken in
> the current versions of Calc and the problems are closely related to
> newline characters within cells. Excel deals with both tabs and newlines
> in cells and as this is a working solution, I'd like to know if there is
> anyone who can provide some information as to how it works. Somehow it
> is doing the impossible, here is why...
>
> If a cell contains either a newline (\n) or a tab (\t), it escapes the
> entire contents with an opening and a closing quote ("). If a cell is
> quoted like this and it contains a quote character in the contents, then
> the quote is escaped by double quoting, ie " is replaced by "".
So far the CSV conventions as if tab instead of comma was used as field
separator (TSV).
> Note
> that within cells, a newline is represented by \n, not \r\n, even though
> this is Windows. The end of a line, however, is designated by \r\n and
> cells are separated by \t. My latest patch has replicated this protocol
> when copying text. With this info in mind, consider two adjacent cells
> both containing three single quotes and another cell containing a tab
> within quotes, so visually where | indicates the division between cells,
> the contents are:
>
> 1) """|"""
> 2) "\t"
>
> When copying or dde linking using unformatted text, we get the following
> for both:
>
> """\t"""\r\n
>
> So it is impossible to distinguish these two sets of contents.
Following CSV rules, quotes would be escaped by doubling them. This
would give:
1) """"""""\t""""""""
2) """\t"""
But that does not seem to be what Excel delivers with DDE?
> However,
> Excel always distinguishing them correctly when dde linking, not pasting
> though. Initially I was wondering if it uses the 'item' information that
> comes alongside the dde data, eg "R1C1:R1C2" to help determine the
> number of rows/columns. However, this is not possible as it also deals
> with this case of having a fixed number of cells, as in this 3 cell case:
>
>
> 1) "\t"|"""|"""
> 2) """|"""|"\t"
>
> both of which result in the following dde data:
>
> """\t"""\t"""\t"""\r\n
>
> When simply copying into Excel, it does not always get it right, which I
> would expect. Also dde linking unformatted text from Word gives Excel
> problems, so the question is how does it solve it for dde linking, which
> contains the same textual data?
I think it simply uses a different protocol when linking between two
Excel documents, as you noticed and mentioned in your other mail.
> I have a hunch it uses dde links using
> the SYLK format instead as when debugging paste linking unformatted text
> from Excel into Calc, a SYLK request arrives in addition to unformatted
> text. In my patch, I've fixed SYLK quoting, however, Calc's version of
> SYLK still does not match the standard approach used by Excel and I
> presume the original Multiplan, so I *think* the SYLK format is
> incorrect, so when dde linking to Excel from OOo, Excel doesn't get it
> right, but Excel to Excel it does.
>
> I've arrived at a juncture. Firstly, does anyone have a good insight
> into all this? Secondly, assuming the dde links are done using SYLK, is
> it okay to change this in OOo to match?
No, not unconditionally. You'd have to somehow distinguish between
a Calc-Calc link and other links that only understand unformatted text,
in sending or receiving.
> Finally, how does this relate to adding in the newline support? Well,
> Calc uses \n as the line terminator on Unix and \r\n on Windows for
> unformatted text copy/paste and linking. If \n exists within cells and
> is escaped with quotes as on Windows, then the same problem arises as I
> showed above with tabs in not being able to determine if \n is the end
> of the line or a new line within a cell. That would mean for dde
> linking, \r\n would need to be used on Unix (\n is used at the moment),
> but this may not be such a surprise given that dde linking is a Windows
> protocol.
Indeed. I'm also not aware of any application other than OOo supporting
DDE on other platforms, except OS/2 that has the same line end
convention.
> A couple more related queries...
> - Does anyone know of any other unix DDE clients, in particular
> spreadsheets? If not the impact of changing the end of line terminator
> from \n to \r\n won't be so big.
> - Calc implies that it supports DDE links as a server using SYLK, DIF,
> HTML, etc in addition to plain text
Erm.. what are you referring with "Calc implies that it supports"?
> but in actual fact they all end up
> calling the text format. This can be observed by debugging impex.cpp
> when doing a copy from Calc and paste link in something like Excel. Is
> this a known quirk? There seem to be other DDE problems eg some of the
> paste link graphic formats into Word give errors.
AFAIK it was not intended to support anything else than cell content
with DDE.
> - It has been getting progressively harder to get a tab character into a
> cell, from 2.4 to DEV300_m16 to OOO300_m7. Is this an accident or are
> there some bug fixes related to this behaviour? I can't see any pattern,
> eg aa\tbb\n will keep the tab in OOO300_m7, but aa\tbb will not and
> aa\tbb will keep the tab in 2.4, but aa\tb will not.
I don't see any difference in behavior with OOO300_m9, given that a tab
can be entered to a cell only when copied from the clipboard. Or what
was your attempt?
> The patch is getting rather big with numerous knock on fixes. Is this
> the sort of point in time that a child workspace should be created for
> it to ease development?
Maybe the best, if this tends to last longer, for rebasing or other
developers to jump in. Do you want me to create one?
> Apologies for long email, but it is all a great big can of worms!!
Indeed..
Eike
--
OOo/SO Calc core developer. Number formatter stricken i18n transpositionizer.
SunSign 0x87F8D412 : 2F58 5236 DB02 F335 8304 7D6C 65C9 F9B5 87F8 D412
OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
Please don't send personal mail to the [EMAIL PROTECTED] account, which I use
for
mailing lists only and don't read from outside Sun. Use [EMAIL PROTECTED]
Thanks.
pgpcaCEsQUQwq.pgp
Description: PGP signature
