Niklas Nebel wrote:
It fixes the issues with missing line breaks, but it does create new issues with unwanted/unhandled breaks. Some problem areas:
- Conversion to unformatted text, especially for clipboard or DDE links
- Other line-based formats (DIF, SYLK)
- Conversion of formulas to text cells (Paste Special, unselect Formulas)
- Text content in the file format (<text:p> versus <text:line-break>)

The only way to be sure is to go through all these GetString calls and see if the line feed character is handled correctly.

I've had a good look at many of these and have posted a new patch fixing various multiline problems. These are a mixture of non-formula cell problems that have always been there and further fixes to multiline formula cells. They affect the SYLK, DIF, HTML, unformatted text and Quattro Pro filters; details are in the patch at http://www.openoffice.org/issues/show_bug.cgi?id=35913. It includes some subtle changes which I hope are okay, as they bring these non-Calc formats in line with other spreadsheet programs. I've been looking closely at Excel, Gnumeric and Quattro Pro. Excel is definitely the most polished, so I've mostly based compatibility on it. Of particular note is the change to the quoting convention for unformatted text and SYLK.

The biggest area for change, though, is DDE links, and I need some help here before implementing them. Firstly, tabs within a cell are broken in the current versions of Calc, and the problems are closely related to those with newline characters within cells. Excel deals with both tabs and newlines in cells, and as this is a working solution, I'd like to know if anyone can provide some information on how it works. Somehow it is doing the impossible. Here is why...

If a cell contains either a newline (\n) or a tab (\t), Excel escapes the entire contents with an opening and a closing quote ("). If a cell is quoted like this and its contents include a quote character, then that quote is escaped by doubling it, i.e. " is replaced by "". Note that within cells a newline is represented by \n, not \r\n, even though this is Windows. The end of a line, however, is designated by \r\n, and cells are separated by \t. My latest patch replicates this protocol when copying text. With this in mind, consider on the one hand two adjacent cells that each contain three quote characters, and on the other hand a single cell containing a tab between two quote characters. Visually, where | indicates the division between cells, the contents are:

1) """|"""
2) "\t"

When copying or DDE linking using unformatted text, we get the following for both:

"""\t"""\r\n

So it is impossible to distinguish these two sets of contents from the text alone. However, Excel always distinguishes them correctly when DDE linking, though not when pasting. Initially I wondered whether it uses the 'item' information that comes alongside the DDE data, e.g. "R1C1:R1C2", to help determine the number of rows and columns. However, that cannot be the whole answer, as Excel also copes when the number of cells is fixed, as in this 3 cell case:


1) "\t"|"""|"""
2) """|"""|"\t"

both of which result in the following DDE data:

"""\t"""\t"""\t"""\r\n
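To make the ambiguity concrete, here is a minimal Python sketch of the quoting rules described above. It is only an illustration of the protocol as I understand it, not code from Calc or the patch; the names quote_cell and serialize_row are hypothetical.

```python
def quote_cell(text):
    """Quote a cell only if it contains a newline or a tab.

    Inside a quoted cell every quote character is doubled.
    """
    if "\n" in text or "\t" in text:
        return '"' + text.replace('"', '""') + '"'
    return text


def serialize_row(cells):
    """Separate cells with tabs and end the row with CRLF."""
    return "\t".join(quote_cell(cell) for cell in cells) + "\r\n"


# Two cells of three quote characters versus one cell holding quote, tab, quote.
two_cells = ['"""', '"""']
one_cell = ['"\t"']
same_two = serialize_row(two_cells) == serialize_row(one_cell)

# The two 3 cell layouts from the example above.
first = ['"\t"', '"""', '"""']
second = ['"""', '"""', '"\t"']
same_three = serialize_row(first) == serialize_row(second)

print(repr(serialize_row(two_cells)), same_two)
print(repr(serialize_row(first)), same_three)
```

Both comparisons come out equal, so a receiver that sees only the text has no way to recover the original cell boundaries.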

When simply copying into Excel, it does not always get it right, which is what I would expect. DDE linking unformatted text from Word also gives Excel problems. So the question is: how does Excel solve this for its own DDE links, which carry the same textual data? I have a hunch that it uses the SYLK format for DDE links instead, because when debugging a paste link of unformatted text from Excel into Calc, a SYLK request arrives in addition to the unformatted text request. In my patch I've fixed SYLK quoting, but Calc's version of SYLK still does not match the standard approach used by Excel and, I presume, the original Multiplan. So I *think* Calc's SYLK output is incorrect, which would explain why Excel doesn't get it right when DDE linking from OOo but does when linking Excel to Excel.

I've arrived at a juncture. Firstly, does anyone have good insight into all this? Secondly, assuming Excel's DDE links are done using SYLK, is it okay to change OOo to match?

Finally, how does this relate to adding newline support? Well, Calc uses \n as the line terminator on Unix and \r\n on Windows for unformatted text copy/paste and linking. If \n exists within cells and is escaped with quotes as on Windows, then the same problem arises as I showed above with tabs: it is not possible to determine whether \n is the end of the line or a line break within a cell. That would mean that for DDE linking \r\n would need to be used on Unix (\n is used at the moment), though this may not be such a surprise given that DDE is a Windows protocol. I'm hoping the solution is to use SYLK for DDE linking, but then the OOo SYLK format would need tweaking.
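The line terminator problem can be shown with a short Python sketch. It assumes the quoting rules described earlier in this mail (quote a cell that contains \n or \t, double any embedded quotes); the function names are hypothetical and this is not code from Calc.

```python
def quote_cell(text):
    """Quote a cell only if it contains a newline or a tab; double embedded quotes."""
    if "\n" in text or "\t" in text:
        return '"' + text.replace('"', '""') + '"'
    return text


def serialize(rows, eol):
    """Serialize rows of cells: tab between cells, eol after each row."""
    return "".join("\t".join(quote_cell(c) for c in row) + eol for row in rows)


# Two rows, each a single cell of three quote characters ...
two_rows = [['"""'], ['"""']]
# ... versus one row whose single cell holds quote, newline, quote.
one_row = [['"\n"']]

# With \n as the row terminator (current Unix behaviour) the outputs collide.
ambiguous_lf = serialize(two_rows, "\n") == serialize(one_row, "\n")

# With \r\n as the row terminator they differ, because the embedded line
# break stays a bare \n.
ambiguous_crlf = serialize(two_rows, "\r\n") == serialize(one_row, "\r\n")

print(ambiguous_lf, ambiguous_crlf)
```

Under these assumptions the \n terminator gives identical text for the two layouts, while \r\n keeps them apart. It does nothing for the tab ambiguity, which is why I'm hoping SYLK is the real answer.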

A couple more related queries...
- Does anyone know of any other Unix DDE clients, in particular spreadsheets? If not, the impact of changing the end of line terminator from \n to \r\n won't be so big.
- Calc implies that it supports DDE links as a server using SYLK, DIF, HTML, etc. in addition to plain text, but in actual fact they all end up calling the text format. This can be observed by debugging impex.cpp when doing a copy from Calc and a paste link in something like Excel. Is this a known quirk? There seem to be other DDE problems, e.g. some of the paste link graphic formats into Word give errors.
- It has been getting progressively harder to get a tab character into a cell, from 2.4 to DEV300_m16 to OOO300_m7. Is this an accident, or are there some bug fixes related to this behaviour? I can't see any pattern, e.g. aa\tbb\n will keep the tab in OOO300_m7, but aa\tbb will not and aa\tbb will keep the tab in 2.4, but aa\tb will not.

The patch is getting rather big, with numerous knock-on fixes. Is this the point at which a child workspace should be created for it to ease development?

Apologies for the long email, but it is all a great big can of worms!!

William
