https://bz.apache.org/ooo/show_bug.cgi?id=126901

--- Comment #2 from [email protected] ---
(In reply to orcmid from comment #1)
> (In reply to damjan from comment #0)
> > Make a CSV file containing:
> > +,123
> [ ... ]
> > It should be imported as the string "+,123" instead, as numbers cannot begin
> > with thousand separators.
> 
> What happens if you use quotes around the string (that is, literally
> "+,123") the normal means for determining that a cell value is meant to be
> taken as text and not anything else?
> 
> The import scheme for text (including CSV) is rather generous and attempts
> to handle more cases than standard CSV, which is why there are all those
> parameters.  It also tends to find numbers in any surround and use that
> number. So it accepts cases that Calc never produces in CSV output when text
> fields are quoted.  
> 
> Because the input filter has so many variations, it is necessary to delimit
> text fields so they import properly always when it is known that
> conventional Excel CSV is intended.  The input filter should then be set
> accordingly.
> 
> I assume that having the text filter work in the way it does was considered
> a feature, and the few kinds of CSVs that Calc produces were simply lumped
> into cases of this more "general" feature.
> 
> EXPERIMENTS
> 
> Importing that file into Excel 2016 with a US locale creates two fields, one
> with the text "+" and another with the numeral 123.  I don't know what
> happens in locales where the comma is a decimal separator.  Same as +.123?
> 
> Using ",123" (with the quotes) instead works fine directly in Excel and will
> work fine in OpenOffice Calc once the Option "Quoted field as text" is
> checked for the text import.

I did an even better test in Excel 2007, which doesn't offer any options when
importing CSV. What I did is type '+,123 into a cell (the ' is necessary to get
Excel to stop complaining about a wrong formula), and then used "Text to
columns" on that cell with space as the separator. It did not split the cell,
and it remained a string. In Calc it would be converted to a number.

> I am confirming the behavior.  Whether it is a defect or by design is a
> bigger question.

I discovered this bug by reading the code, and then coming up with this test
case that will reproduce it, so I suspect it's a defect. Method
ScStringUtil::parseSimpleNumber() in main/source/core/tool/stringutil.cxx, the
"else if (c == gsep)" case in the "for" loop attempts to fail parsing the
number if the thousand separator appears in wrong places, such as after the
decimal separator, or not 3 digits apart. One of the tests is whether the
thousand separator is "// not allowed as the first character", and performed by
checking whether "i == 0", which is not strict enough, because a '+' or a '-'
can be the first character, and the thousand separator will still be wrongly
placed if it appears after it and before any digits. The test should be "not
allowed before digits" instead.

-- 
You are receiving this mail because:
You are the assignee for the issue.

Reply via email to