https://bugs.documentfoundation.org/show_bug.cgi?id=152337

            Bug ID: 152337
           Summary: Show a warning infobar when imported text file used
                    several of selected field separators
           Product: LibreOffice
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Keywords: needsUXEval
          Severity: enhancement
          Priority: medium
         Component: Calc
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]
            Blocks: 109238

When opening a text (CSV, TSV, ...) file in Calc, the import dialog allows
selecting several field separators, and three of those (tab/comma/semicolon)
are selected by default. This provides a kind of "autodetection" for simple
cases. It also allows importing less structured data that actually uses
different separators simultaneously.

In the latter case, seeing several of the selected separators in the file is
expected and normal. However, there is a more common case: the data is actual
CSV/TSV or the like and uses only one separator, but some unquoted textual
data in it includes other characters that happen to be among the selected
separators. In such cases, a user relying on the automagic might not notice
that their large body of data was imported wrong, with some fields split on
these false separators.

E.g., a CSV (only commas actually used for field separation) could have a
semicolon in a field:

a,b,c
content of field a with semicolon ; - but still one field!,field b,field c

Such a CSV, when imported with default settings in the import dialog (so tabs,
commas, and semicolons are checked), would split the second row into *four*
cells, which is not what the user would expect:

content of field a with semicolon 
 - but still one field!
field b
field c
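
The split above can be reproduced with a minimal Python sketch. This is only
an illustration of "split on any selected separator", not Calc's actual import
code; the function name split_on_any is hypothetical, and quoting is ignored
because the example row has no quoted fields.

```python
def split_on_any(line, separators):
    """Split a line wherever any of the given separator characters occurs."""
    fields, current = [], []
    for ch in line:
        if ch in separators:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


ROW = "content of field a with semicolon ; - but still one field!,field b,field c"

# Default selection (tab, comma, semicolon): four cells.
print(split_on_any(ROW, "\t,;"))
# Only the separator the file really uses (comma): three cells.
print(split_on_any(ROW, ","))
```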

If that happens somewhere in the middle of 100 000 rows of data, it may easily
be overlooked. The user could work on the data, edit it, save, and not notice
that some data got corrupted. After that, there is no easy way to find and
undo the corruption.

The idea is to show an infobar in such a case, informing the user that Calc
saw several of the selected field separators in the imported file, and hinting
that *possibly* it should be inspected and re-imported with only the actual
separator used in the file selected. It would be a false detection in the
"less structured data" case discussed in the first paragraph above, but likely
a minor annoyance with relatively low impact, compared to the current
potential for unnoticed data corruption in the more common scenario.


Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=109238
[Bug 109238] [META] CSV bugs and enhancements