[Libreoffice-bugs] [Bug 155384] Fixed width paste does not sample enough input data

bugzilla-daemon Sat, 20 May 2023 06:48:17 -0700

https://bugs.documentfoundation.org/show_bug.cgi?id=155384


--- Comment #14 from Pierre Fortin <[email protected]> ---
(In reply to ady from comment #13)

> The problem is that the last column's width is limited in the dialog and it
> seems it cannot be marked in any way in order for it to be wider. (At least
> the data is not completely lost.)

Not lost, but certainly not useful... and destructive if I've only allotted 2
columns and the data is split into 3, thereby overwriting some of the data in
the adjacent column on the right... at least there's Ctrl+Z.

While "natural sort" can be used on this data, arithmetic operations cannot,
unless the original column is split into a prefix and a gapped sequence.

> Having said that, care should be taken if solving this issue, I could
> imagine some case in which someone complains in the opposite direction (e.g.
> the last column is too wide and contains unwanted characters, not seen in
> the dialog).

Which "unwanted characters" would you foresee being added?  I've always held
the position: "I can't insert data that doesn't exist"...  The last column
should only contain what is present, without "padding".  Or did you mean
something else?

> It seems as if somehow some lookup is performed in order to hint about some
> columns' widths; when solving this problem, care should be taken not to
> delay the dialog too much (for instance if there are many columns and many
> rows).

This is a "fixed width" issue, not a "separated by" one. In fact, I often
separate columns with "separated by [comma]"... Since the user has already
chosen how to split the first N characters, Calc only needs to look at the
remaining X characters of each row to determine the max width of the last
column.
A Python example:
  maxXwidth = 0  # width of the last expected column
  for row in rows:
      maxXwidth = max(maxXwidth, len(row[last_split_point:]))
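A self-contained version of the same idea, as a rough sketch only: the function
names and sample rows here are made up for illustration and say nothing about
how Calc implements the dialog. It scans every row for the longest remainder
after the final split point, then cuts each row so the last field keeps
whatever is present, without padding.

```python
def last_column_width(rows, last_split_point):
    """Return the longest remainder found after the final split point."""
    return max((len(row[last_split_point:]) for row in rows), default=0)


def split_fixed_width(row, split_points):
    """Cut one row at the given character offsets; the last field keeps the rest."""
    bounds = [0, *split_points, len(row)]
    return [row[start:end] for start, end in zip(bounds, bounds[1:])]


# Hypothetical sample data: a 2-character prefix followed by a variable tail.
rows = ["AB0001", "AB0002-extra", "CD17"]
width = last_column_width(rows, 2)
fields = [split_fixed_width(row, [2]) for row in rows]
```

With the last field taking the whole remainder, the result is always exactly
two columns here, so nothing spills into a third column.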

As for "not to delay": this is a column issue, not a sheet issue, so an
in-memory scan should be barely noticeable, if at all. To wit, anyone who uses
AutoFilter is subliminally aware of the likely delay when dealing with a column.

> I am able to overcome the case by using the "Separated by" alternative when
> importing the first time the csv (no need to copy after the initial import),
> but, again, this is because the relevant column is not the _last_ one.

Sorry; but what I provided was a sample of the issue. The real "initial import"
contains 90+ varying width columns, delimited by tab, comma, or other. There's
no way to use fixed width on that initial data without providing a dialog which
allows "fixed width" splitting on certain columns within a "separated by"
scenario. e.g., initial data is imported with "separated by" along with certain
columns further split via a second level "fixed width" or "separated by" (see
below).
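To make the two-level idea concrete, here is a minimal sketch of what I mean,
assuming the data is plain delimited text. The function names and the
fixed_width_columns mapping are hypothetical, not an existing Calc option: the
first pass is "separated by", and only the listed columns get a second
"fixed width" pass.

```python
import csv
import io


def split_fixed_width(text, split_points):
    """Cut text at the given character offsets; the last field keeps the rest."""
    bounds = [0, *split_points, len(text)]
    return [text[start:end] for start, end in zip(bounds, bounds[1:])]


def import_two_level(data, delimiter, fixed_width_columns):
    """First split on the delimiter, then re-split selected columns by width.

    fixed_width_columns maps a column index to its list of split offsets.
    """
    table = []
    for record in csv.reader(io.StringIO(data), delimiter=delimiter):
        out = []
        for index, cell in enumerate(record):
            if index in fixed_width_columns:
                out.extend(split_fixed_width(cell, fixed_width_columns[index]))
            else:
                out.append(cell)
        table.append(out)
    return table


# Hypothetical sample: the middle column is a 2-character prefix plus a tail.
sample = "x,AB0001,10\ny,CD0002-long,20\n"
table = import_two_level(sample, ",", {1: [2]})
```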

> So, clearly, there is some problem when importing, with the "last" column's
> width.

Thinking about the problem more logically... aside from the incorrect final
split, this may be more of a feature request: the need is to split a single
column. In this case it is "fixed width", but I also need to split a single
column with "separated by", such as an address column into street, city, zip...
Or a phone number column into area_code, NNX, number (separated by '-'). This is
further complicated for me because I have to deal with international data for
which there is no standard when multiple countries' data is provided within a
single column.  ;p
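
For the "separated by" case on a single column, something along these lines
would do. Again this is only a sketch with a made-up helper name and made-up
sample numbers: every value is split into a fixed number of fields, and values
that don't fit the pattern are padded rather than allowed to shift or overwrite
neighbouring columns.

```python
def split_column(values, separator, parts):
    """Split each value into exactly `parts` fields, padding short rows with ''."""
    rows = []
    for value in values:
        # Limit the number of splits so extra separators stay in the last field.
        fields = value.split(separator, parts - 1)
        fields += [""] * (parts - len(fields))
        rows.append(fields)
    return rows


# Hypothetical mixed-format phone data in one column.
phones = ["514-555-0123", "555-0123", "+44 20 7946 0958"]
split_phones = split_column(phones, "-", 3)
```

Values without the separator (the international number above) stay whole in
the first field, which is about the best that can be done without a
per-country rule.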

-- 
You are receiving this mail because:
You are the assignee for the bug.
