On Thu, Jan 8, 2015 at 5:38 PM, Robert Helling <[email protected]> wrote:
> Hi,
>
> On 05.01.2015, at 20:24, Miika Turkia <[email protected]> wrote:
>
> Signed-off-by: Miika Turkia <[email protected]>
> ---
> Documentation/user-manual.txt | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
> index 9cbc4cc..e7c5d96 100644
> --- a/Documentation/user-manual.txt
> +++ b/Documentation/user-manual.txt
> @@ -1094,7 +1094,7 @@ an introduction to CSV-formatted files see xref:S_CSV_Intro[A Diver's Introducti
> [icon="images/icons/important.png"]
> [IMPORTANT]
> The CSV import has a couple of caveats. You should avoid some special characters
> -like ampersand (&) and double quotes ("), the latter if quoting text cells. The
> +like ampersand (&), less than (<), greater than (>) and double quotes ("), the latter if quoting text cells. The
> file should use UTF-8 character set, if having non-ASCII characters. Also the
> size of the CSV file might cause problems. Importing 100 dives at a time
> (without dive profile) has worked previously, but larger files might exceed
>
>
> (sorry again for being so late)
>
> Of course there could be millions of sources of CSV files (and many of them
> broken in the sense that they produce non-parsable output), but the issue is
> not completely hopeless: I just had a look at what LibreOffice does when
> asked to save a spreadsheet with challenging characters as a CSV and it
> does as attached.
>
> As you can see, double quotes (up and down) are a different character
> than the one used as a field separator
>
> 00000000: 2254 6869 7320 6365 6c6c 2063 6f6e 7461  "This cell conta
> 00000010: 696e 7320 6368 616c 6c65 6e67 696e 6720  ins challenging
> 00000020: 6368 6172 6163 7465 7273 206c 696b 6520  characters like
> 00000030: 7175 6f74 6573 2064 6f77 6e20 e280 9e20  quotes down ...
> 00000040: 616e 6420 7570 e280 9c2c 2063 6f6d 6d61  and up..., comma
> 00000050: 732c 2061 706f 7374 726f 7068 7320 2720  s, apostrophs '
> 00000060: 616e 640a 4e65 7720 6c69 6e65 732e 222c  and.New lines.",
> 00000070: 5468 6973 2069 7320 7468 6520 6e65 7874  This is the next
> 00000080: 2063 656c 6c2e 0a                        cell..
>
> When the apostrophe is used as a field separator and it appears inside the
> cell, it is just repeated.
>
> I have no idea how general these rules are (Wikipedia says quotation
> repetition is in an RFC) but maybe we should support them. What is the
> reason for warning about XML-special characters like & and <>?
>

CSV is parsed with XSLT as XML, so the XML-specific characters are
therefore an issue. That is also why any special cases are quite tricky:
I am not fluent enough in XSLT to take the special cases into account
properly. Quotation should (hopefully) be handled properly, but I cannot
promise that multi-line cells work that well.

miika
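
For illustration only, here is a rough sketch of the two issues discussed
above: RFC 4180-style quoting (doubled quotes, line breaks inside a quoted
cell) and the XML-special characters. It is written in Python, not XSLT, and
is not how the Subsurface importer works. The first record mirrors the bytes
in the hex dump; the second record, the csv_to_xml name and the
<rows>/<row>/<cell> element names are made up for the example.

```python
import csv
import io
from xml.sax.saxutils import escape

# First record: the text from the hex dump above (U+201E and U+201C are the
# "down" and "up" quotes). Second record: a hypothetical one added to show a
# doubled quote ("") and the XML-special characters &, < and >.
SAMPLE = (
    '"This cell contains challenging characters like quotes down \u201e '
    'and up\u201c, commas, apostrophs \' and\n'
    'New lines.",This is the next cell.\n'
    '"He said ""hi"" & left <early>",plain\n'
)


def csv_to_xml(text):
    """Parse RFC 4180-style CSV and return XML text with escaped cell content."""
    # newline="" keeps line breaks inside quoted cells exactly as written.
    reader = csv.reader(io.StringIO(text, newline=""))
    out = ["<rows>"]
    for row in reader:
        # escape() replaces &, < and > with &amp;, &lt; and &gt;.
        cells = "".join("<cell>%s</cell>" % escape(cell) for cell in row)
        out.append("<row>%s</row>" % cells)
    out.append("</rows>")
    return "\n".join(out)


if __name__ == "__main__":
    print(csv_to_xml(SAMPLE))
```

The point of the sketch is only that the quoting rules can be handled by a
CSV-aware parser before any XML is produced, and that escaping the cell text
at that stage removes the need to warn about &, < and >. Whether something
equivalent is practical inside the XSLT is a separate question.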
_______________________________________________
subsurface mailing list
[email protected]
http://lists.subsurface-divelog.org/cgi-bin/mailman/listinfo/subsurface
