Re: [libreoffice-users] Pasting Tabular Data Into Calc From An HTML Source

2016-02-08 Thread Paul D. Mirowsky

Have you tried Tools - Options - Load/Save - HTML Compatibility?
See: https://help.libreoffice.org/Common/About_Import_and_Export_Filters



On 2/6/2016 11:21 AM, James E. Lang wrote:

Thank you, Piet.

So do I need to install LO 3.6.7.2 in order to correct this? Or, should I 
implement the following twelve steps?

1) use my eMail client to copy these frequent reports to my computer from Gmail

2) run a script or program to:

2.1) open that file

2.2) strip out everything before 

2.3) unpack the Quoted-Printable encoding (preserving the character set 
specification somehow)

2.4) save the file

3) switch to Calc

3.1) insert a temporary sheet from the file

3.2) select all data from that sheet

3.3) switch to the sheet where the data belongs

3.4) paste in the appropriate location

3.5) delete the temporary sheet

I think all those steps are doable. Per your suggestions, step 3.1 could be 
"open the HTML file in LO," step 3.3 would be "switch to the main document," 
and step 3.5 would be "close the source (temporary) spreadsheet without saving 
it." As I see it, step 2.3 is the most complex.




--
To unsubscribe e-mail to: users+unsubscr...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted


Re: [libreoffice-users] Pasting Tabular Data Into Calc From An HTML Source

2016-02-06 Thread James E. Lang
Thank you, Piet. 

So do I need to install LO 3.6.7.2 in order to correct this? Or, should I 
implement the following twelve steps?

1) use my eMail client to copy these frequent reports to my computer from Gmail

2) run a script or program to:

2.1) open that file

2.2) strip out everything before 

2.3) unpack the Quoted-Printable encoding (preserving the character set 
specification somehow)

2.4) save the file

3) switch to Calc

3.1) insert a temporary sheet from the file

3.2) select all data from that sheet

3.3) switch to the sheet where the data belongs

3.4) paste in the appropriate location

3.5) delete the temporary sheet

I think all those steps are doable. Per your suggestions, step 3.1 could be 
"open the HTML file in LO," step 3.3 would be "switch to the main document," 
and step 3.5 would be "close the source (temporary) spreadsheet without saving 
it." As I see it, step 2.3 is the most complex.
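
Steps 2.1 through 2.4 above could be sketched with Python's standard `email` package, which undoes the Quoted-Printable encoding and applies the declared character set in one call. This is only a sketch under assumptions not stated in the thread: that the report is saved as a complete message file (headers plus body, e.g. an .eml export), and that taking the message's text/html part is an acceptable stand-in for step 2.2. The function names `extract_html` and `save_html` are made up for illustration.

```python
from email import message_from_bytes, policy


def extract_html(raw_message: bytes) -> str:
    """Return the decoded HTML body of a saved e-mail message."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    # Pick the text/html part; this also works for a single-part HTML message.
    part = msg.get_body(preferencelist=("html",))
    if part is None:
        raise ValueError("message has no text/html part")
    # get_content() undoes the Content-Transfer-Encoding (for example
    # quoted-printable) and decodes the bytes using the charset declared
    # in the part's Content-Type header.
    return part.get_content()


def save_html(raw_message: bytes, out_path: str) -> None:
    """Write the decoded HTML to out_path as UTF-8 (hypothetical helper)."""
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(extract_html(raw_message))
```

Because the result is written as UTF-8, a `<meta>` charset declaration inside the HTML that names a different encoding would have to be adjusted by hand before the file is opened in Calc.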

-- 
Jim

-Original Message-
From: Piet van Oostrum 
To: "James E. Lang" 
Cc: users@global.libreoffice.org
Sent: Fri, 05 Feb 2016 2:39
Subject: Re: [libreoffice-users] Pasting Tabular Data Into Calc From An HTML 
Source

James E. Lang wrote:

 > LO 4.4.7.2
 > 
 > I receive reports on a regular basis from a pizza delivery driver who
 > sends them from his iPad. A raw data dump of the HTML part of a
 > sample report with customer identifying information replaced by
 > "Address 1" and "Address 2" will be pasted below my signature.
 > 
 > Receiving this report via Pegasus Mail I have two potentially
 > useful ways to paste the content of these reports into LO Calc
 > (HTML and RTF). Each of these has an issue, though I can work with
 > the RTF method far more easily than the HTML method.
 > 
 > My problem with the HTML method (which, by the way, also exists
 > with tables of data copied directly from my power company's web
 > site) is that the header row () contains a colspan attribute
 > that is misapplied by LO. In this  row the first (and only)
 > field is not merged across any columns to its right but it should
 > be merged across 14 additional columns. In the first row below that
 > the first field is not (nor should it be) merged at all but in the
 > next four rows the first field is erroneously merged progressively
 > over 15, 29, 43, and 57 columns. The first (and again only) field
 > in the second table's  row is merged across 71 columns instead
 > of 11. The first field of the next row is merged across the same 71
 > columns though it should not be merged at all and on the remaining
 > two rows the first field is erroneously merged across 81 and then
 > 91 columns.
 > 
 > Has anyone else experienced this anomaly?
 
This is a known bug: https://bugs.documentfoundation.org/show_bug.cgi?id=74577
 
This bug seems to be more or less forgotten. It is a regression, and the point 
at which it was introduced has been identified, but there has been no activity 
for more than one year.

By the way, I noticed some anomalies in your HTML code. When I opened it in 
Firefox, two fields of the tables had moved out of the tables. It appears there 
are some Non-Breaking-Spaces in it which cause this. After replacing these with 
normal spaces, the tables appear as they should.
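
The replacement described above could be scripted along these lines. This is a minimal sketch: the thread does not say whether the non-breaking spaces appear as literal U+00A0 characters or as HTML entities, so the pattern covers both as an assumption, and the name `replace_nbsp` is made up for illustration.

```python
import re

# Matches a literal U+00A0 character or the usual HTML spellings of it.
_NBSP = re.compile(r"\u00a0|&nbsp;|&#160;|&#x0*a0;", re.IGNORECASE)


def replace_nbsp(html: str) -> str:
    """Return html with every non-breaking space turned into a normal space."""
    return _NBSP.sub(" ", html)
```

The function expects already-decoded text, so it would run after the Quoted-Printable decoding and before the file is opened in a browser or in Calc.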

Secondly, when I open the HTML file with LibreOffice (version 5.1 RC 3) it 
opens as a spreadsheet with the correct layout, except for the background 
colors. The same is possible with Sheet > Insert Sheet From File.
So this might be a workaround for you.
-- 
Piet van Oostrum 
WWW: http://pietvanoostrum.com/
PGP key: [8DAE142BE17999C4]




[libreoffice-users] Pasting Tabular Data Into Calc From An HTML Source

2016-02-04 Thread James E. Lang
LO 4.4.7.2

I receive reports on a regular basis from a pizza delivery driver who sends 
them from his iPad. A raw data dump of the HTML part of a sample report, with 
customer identifying information replaced by "Address 1" and "Address 2", will 
be pasted below my signature.

Receiving this report via Pegasus Mail, I have two potentially useful ways to 
paste the content of these reports into LO Calc (HTML and RTF). Each of these 
has an issue, though I can work with the RTF method far more easily than the 
HTML method.

My problem with the HTML method (which, by the way, also exists with tables of 
data copied directly from my power company's web site) is that the header row 
() contains a colspan attribute that is misapplied by LO. In this  row the 
first (and only) field is not merged across any columns to its right but it 
should be merged across 14 additional columns. In the first row below that the 
first field is not (nor should it be) merged at all, but in the next four rows 
the first field is erroneously merged progressively over 15, 29, 43, and 57 
columns. The first (and again only) field in the second table's  row is merged 
across 71 columns instead of 11. The first field of the next row is merged 
across the same 71 columns though it should not be merged at all, and on the 
remaining two rows the first field is erroneously merged across 81 and then 91 
columns.

Has anyone else experienced this anomaly? 
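
For comparison, the expected placement of cells can be worked out from the colspan values alone. The sketch below uses Python's standard `html.parser` to list, for each table row, which columns every cell should cover. It is an illustration only, not LO's import code: it ignores rowspan and nested tables, and the names `SpanCollector` and `column_ranges` are made up.

```python
from html.parser import HTMLParser


class SpanCollector(HTMLParser):
    """Collect the colspan of every <td>/<th>, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            # A missing or empty colspan means the cell covers one column.
            span = dict(attrs).get("colspan") or "1"
            self.rows[-1].append(int(span))


def column_ranges(html):
    """Return, per row, the (first, last) column index covered by each cell."""
    parser = SpanCollector()
    parser.feed(html)
    result = []
    for spans in parser.rows:
        col, ranges = 0, []
        for span in spans:
            ranges.append((col, col + span - 1))
            col += span  # the next cell starts right after this one
        result.append(ranges)
    return result
```

With a made-up two-row table shaped like the first one in the report (a header cell with colspan 15 above a row of 15 plain cells):

```python
sample = ('<table><tr><th colspan="15">Pizza Deliveries</th></tr><tr>'
          + "<td>x</td>" * 15 + "</tr></table>")
ranges = column_ranges(sample)
print(ranges[0])     # [(0, 14)]
print(ranges[1][0])  # (0, 0)
```

The header cell covers columns 0 through 14 and the first cell of the next row covers only column 0, which is the layout the pasted data should have had.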

I don't know how LO determines what is a numeric field and what is a textual 
field when pasting from the clipboard, but my problem with the RTF method is 
that LO gets it backward. Cells that receive numeric fields get formatted as 
Category Text with Format @ and those that receive textual fields get 
formatted as Category Number with Format General. This means that if I 
subsequently enter a formula [e.g. =SUM(A5:A15)] where a numeric field (e.g. 5 
or $7.52) had been, that formula gets interpreted as text instead of being 
interpreted as a formula. The values in A5:A15 (which were also pasted from an 
HTML source) are zero since they are formatted as text, so the formula cannot 
work anyway. This is not theory. I stumbled upon this when I tried to shift 
some columns without retaining their former location info (Copy, Del, Pas
Can anyone else verify this?
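
One way to sidestep the text-versus-number guess would be to classify each field before it ever reaches Calc. The following is a hedged sketch, not something LO does: it assumes amounts look like the ones in the sample report (an optional leading minus, an optional dollar sign, digits with an optional decimal part), and the name `parse_amount` is made up.

```python
import re
from decimal import Decimal

# Optional minus, optional dollar sign, digits with optional thousands
# separators and an optional decimal part, and nothing else.
_AMOUNT = re.compile(r"(-?)\$?(\d[\d,]*(?:\.\d+)?)")


def parse_amount(text):
    """Convert '5', '$7.52' or '-$10.74' to a Decimal; return None for other text."""
    match = _AMOUNT.fullmatch(text.strip())
    if match is None:
        return None
    sign, digits = match.groups()
    return Decimal(sign + digits.replace(",", ""))
```

Fields for which this returns None (addresses, dates, notes) would be kept as text, and everything else written out as a number.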

Are there any bugzilla reports regarding any of these issues? Into how many 
bugzilla reports should I break this down? [HTML -- 1? or 2? RTF -- 1? or 2? 
Inability to enter a formula that is recognized as such into a Category Text 
Format @ cell -- ?] I see the possibility of as many as five separate bugzilla 
reports but I don't look forward to submitting them.

--
Jim

=== Start of the source HTML data =
Pizza Deliver=
iesCHKTicketWhereWhenLastPriceCashCard=E2=88=86 paypayreconcilia=
tiontipincome=
OrdersNote812718Address 1=
 2/3/16, 8:54 PM1$54.4=
7$59.47$5.74-$10.74$5.00$10.741<=
td nowrap=3D""><=
/td>812719Address 2 2/3/16, 8:55 PM1$42.98$49.98<=
br>$4.67-$11.67$7.00$11.67=
1812719Address 2=
 2/3/16, 8:56 PM1-$0.67$0.00$0.00$0.00$0.000second trip to excha=
nge salad dressings0/3$97.45$0.00$109.45-$0.67$10.41-$22.41$12.00$22.412=
Other TransactionsCHKTicketWhere<=
th nowrap=3D"">WhenLastPrice<=
th nowrap=3D"">CashCardpay=E2=88=86 payNote812719Address 2 2/3/16, 8:56 PM1=
$0.00=
-$0.670/1$0.00$0.00$0.00=
$0.00-$0.67Sent from my iPad=
 
=== End of pasted raw view of the second part =
  
--- End of forwarded message ---

