Bug ID: 117672
Summary: Make default column type 'Text' when importing CSVs,
or warn of data modification
Version: 22.214.171.124 release
OS: Linux (All)
With the default CSV import settings, column types are set to 'Standard.' This
means that columns that contain very long numbers (such as serial numbers) get
converted to notation, which can result in data loss due to rounding.
The principal of least surprise would indicate that, if I open a CSV,
LibreOffice will not make changes to it that I haven't explicitly requested.
CSVs do not have a concept of data types -- every field in a CSV is 'text'. If
LibreOffice tries to infer meaning based on the content of these text fields,
it will get it wrong some of the time, resulting in unexpected behaviour.
The current solution is to select all the cells in the CSV import window and
change 'Standard' to 'Text.' Since all of the data in a CSV is text, I would
argue that making 'Text' the default column type is more correct and would
result in less unexpected behaviour, at least for the use cases I have in mind
(opening a CSV, making a minor modification, saving and closing it.)
If it is felt that the current behaviour is preferable for most of your users,
an alternative solution would be to provide the user with some warning that
LibreOffice is about to modify your data. This could take the form of a
dialogue box that appears after you click 'OK' in the Text Import window. This
new dialogue box could feature a Cancel button, allowing a user to go back to
the Text Import window and change their settings, and could perhaps include a
recommendation that the user change their column types to 'Text' if they don't
want LibreOffice to modify their data.
Silently changing data by default is the wrong behaviour, in my opinion.
Imagine a user working with a large sheet that has long numbers in column Z
that don't immediately appear on screen and aren't relevant to the user, so
they probably won't bother looking at them. They make their changes to the
cells that are relevant to them (say column A or B,) save and send the csv
around to their colleagues, never realizing that LibreOffice has modified part
their data. Now they look silly because they didn't predict LO's default
behaviour. Or they try to import it into the database using column Z (pretend
they're serial numbers) as a key. If those serial numbers are actually serial,
the same record might get updated repeatedly due to the rounding. Or some
records with sufficiently similar keys might get updated multiple times. Or
something else -- at this point we are well in to the realm of unexpected,
Far better just to not make changes when they aren't asked for. Or at least
warn when LO is about to make those changes so the user can prevent it from
doing so by changing the import settings.
Steps to Reproduce:
1.echo 12345678901234567890 > test.csv
2.vi test.csv -- notice the number, as echoed
3.open test.csv in lo calc using the default settings
5.vi test.csv -- notice that the number has changed, despite not having
explicitly done anything to the file and despite receiving no warning that my
csv file will essentially be corrupted if I save
The cell has been changed to 1.23456789012346E+019
The cell should appear as 12345678901234567890
User Profile Reset: No
Build ID: 6.0.4-1
CPU threads: 8; OS: Linux 4.16; UI render: default; VCL: kde4;
Locale: en-US (en_US.UTF-8); Calc: group
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:59.0) Gecko/20100101
You are receiving this mail because:
You are the assignee for the bug.
Libreoffice-bugs mailing list