Hi William, On Saturday, 2008-11-08 23:48:43 +0000, William S Fulton wrote:
> I've had a good look at many of these and have posted a new patch fixing > various multiline problems. This is great! I think that patch does it, other opinions? > It includes some > subtle changes which I hope are okay as they bring these non-Calc > formats in line with other spreadsheet programs, I've been looking > closely at Excel, Gnumeric and Quattro Pro. Excel is definitely the most > polished and I've mostly based compatibility on this program. Of > particular note is the unformatted text and SYLK quoting convention > change. Bringing SYLK closer to what Excel reads/writes is always good. I'm not sure about the unformatted text, is the quoting convention now what Excel writes to the clipboard if requested? > The biggest area for change though is DDE links and I need some help > here before implementing them. Firstly, tabs within a cell are broken in > the current versions of Calc and the problems are closely related to > newline characters within cells. Excel deals with both tabs and newlines > in cells and as this is a working solution, I'd like to know if there is > anyone who can provide some information as to how it works. Somehow it > is doing the impossible, here is why... > > If a cell contains either a newline (\n) or a tab (\t), it escapes the > entire contents with an opening and a closing quote ("). If a cell is > quoted like this and it contains a quote character in the contents, then > the quote is escaped by double quoting, ie " is replaced by "". So far the CSV conventions as if tab instead of comma was used as field separator (TSV). > Note > that within cells, a newline is represented by \n, not \r\n, even though > this is Windows. The end of a line, however, is designated by \r\n and > cells are separated by \t. My latest patch has replicated this protocol > when copying text. With this info in mind, consider two adjacent cells > both containing three single quotes and another cell containing a tab > within quotes, so visually where | indicates the division between cells, > the contents are: > > 1) """|""" > 2) "\t" > > When copying or dde linking using unformatted text, we get the following > for both: > > """\t"""\r\n > > So it is impossible to distinguish these two sets of contents. Following CSV rules, quotes would be escaped by doubling them. This would give: 1) """"""""\t"""""""" 2) """\t""" But that does not seem to be what Excel delivers with DDE? > However, > Excel always distinguishing them correctly when dde linking, not pasting > though. Initially I was wondering if it uses the 'item' information that > comes alongside the dde data, eg "R1C1:R1C2" to help determine the > number of rows/columns. However, this is not possible as it also deals > with this case of having a fixed number of cells, as in this 3 cell case: > > > 1) "\t"|"""|""" > 2) """|"""|"\t" > > both of which result in the following dde data: > > """\t"""\t"""\t"""\r\n > > When simply copying into Excel, it does not always get it right, which I > would expect. Also dde linking unformatted text from Word gives Excel > problems, so the question is how does it solve it for dde linking, which > contains the same textual data? I think it simply uses a different protocol when linking between two Excel documents, as you noticed and mentioned in your other mail. > I have a hunch it uses dde links using > the SYLK format instead as when debugging paste linking unformatted text > from Excel into Calc, a SYLK request arrives in addition to unformatted > text. In my patch, I've fixed SYLK quoting, however, Calc's version of > SYLK still does not match the standard approach used by Excel and I > presume the original Multiplan, so I *think* the SYLK format is > incorrect, so when dde linking to Excel from OOo, Excel doesn't get it > right, but Excel to Excel it does. > > I've arrived at a juncture. Firstly, does anyone have a good insight > into all this? Secondly, assuming the dde links are done using SYLK, is > it okay to change this in OOo to match? No, not unconditionally. You'd have to somehow distinguish between a Calc-Calc link and other links that only understand unformatted text, in sending or receiving. > Finally, how does this relate to adding in the newline support? Well, > Calc uses \n as the line terminator on Unix and \r\n on Windows for > unformatted text copy/paste and linking. If \n exists within cells and > is escaped with quotes as on Windows, then the same problem arises as I > showed above with tabs in not being able to determine if \n is the end > of the line or a new line within a cell. That would mean for dde > linking, \r\n would need to be used on Unix (\n is used at the moment), > but this may not be such a surprise given that dde linking is a Windows > protocol. Indeed. I'm also not aware of any application other than OOo supporting DDE on other platforms, except OS/2 that has the same line end convention. > A couple more related queries... > - Does anyone know of any other unix DDE clients, in particular > spreadsheets? If not the impact of changing the end of line terminator > from \n to \r\n won't be so big. > - Calc implies that it supports DDE links as a server using SYLK, DIF, > HTML, etc in addition to plain text Erm.. what are you referring with "Calc implies that it supports"? > but in actual fact they all end up > calling the text format. This can be observed by debugging impex.cpp > when doing a copy from Calc and paste link in something like Excel. Is > this a known quirk? There seem to be other DDE problems eg some of the > paste link graphic formats into Word give errors. AFAIK it was not intended to support anything else than cell content with DDE. > - It has been getting progressively harder to get a tab character into a > cell, from 2.4 to DEV300_m16 to OOO300_m7. Is this an accident or are > there some bug fixes related to this behaviour? I can't see any pattern, > eg aa\tbb\n will keep the tab in OOO300_m7, but aa\tbb will not and > aa\tbb will keep the tab in 2.4, but aa\tb will not. I don't see any difference in behavior with OOO300_m9, given that a tab can be entered to a cell only when copied from the clipboard. Or what was your attempt? > The patch is getting rather big with numerous knock on fixes. Is this > the sort of point in time that a child workspace should be created for > it to ease development? Maybe the best, if this tends to last longer, for rebasing or other developers to jump in. Do you want me to create one? > Apologies for long email, but it is all a great big can of worms!! Indeed.. Eike -- OOo/SO Calc core developer. Number formatter stricken i18n transpositionizer. SunSign 0x87F8D412 : 2F58 5236 DB02 F335 8304 7D6C 65C9 F9B5 87F8 D412 OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS Please don't send personal mail to the [EMAIL PROTECTED] account, which I use for mailing lists only and don't read from outside Sun. Use [EMAIL PROTECTED] Thanks.
pgpcaCEsQUQwq.pgp
Description: PGP signature