Bug#1020215: w3m interprets <title> when not in <head> but in <svg>

2023-01-06 Thread Tatsuya Kinoshita
On 2023-01-06 at 19:31 +0100, Robert Alm Nilsson wrote:
> After using my version of w3m every day since I posted this, the only
> problem I have noticed on a few pages is that it doesn't handle titles
> directly under html tags, so this doesn't work:
>
> <html><title>Hello</title></html>

Ah, it should be interpreted as <title> in <head>.

I've installed a patch that only picks a first title as a workaround.

  - https://github.com/tats/w3m/commit/b9c022a8101e6325468bf310d6038df11e5fac67
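The "first title wins" idea can be sketched roughly as follows. This is a minimal illustration only, not the code from the commit above: `struct page` and `set_title_once` are hypothetical names that do not exist in w3m.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for whatever holds the page title. */
struct page {
    char *title;    /* NULL until a title has been stored */
};

/* Store a copy of text as the title only if none is set yet.
 * Returns 1 if the title was stored, 0 if it was ignored
 * (a title already exists, or the allocation failed). */
int set_title_once(struct page *pg, const char *text)
{
    size_t n;
    char *copy;

    if (pg->title != NULL)
        return 0;               /* a later <title>, e.g. one inside <svg> */

    n = strlen(text) + 1;
    copy = malloc(n);
    if (copy == NULL)
        return 0;
    memcpy(copy, text, n);
    pg->title = copy;
    return 1;
}
```

With this shape, a second call such as `set_title_once(&pg, "Go")` leaves an earlier title untouched, and the caller frees `pg.title` when done.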

Thanks for your report.

--
Tatsuya Kinoshita




Bug#1020215: w3m interprets <title> when not in <head> but in <svg>

2023-01-06 Thread Robert Alm Nilsson
After using my version of w3m every day since I posted this, the only
problem I have noticed on a few pages is that it doesn't handle titles
directly under html tags, so this doesn't work:

<html><title>Hello</title></html>

I don't know if we want to care about that case, and I don't think it's
valid HTML, but I wanted to let you know.



Bug#1020215: w3m interprets <title> when not in <head> but in <svg>

2023-01-06 Thread Tatsuya Kinoshita
Control: tags -1 + patch fixed-upstream pending

On 2023-01-06 at 15:26 +0100, Rene Kita wrote:
> Reviewed and tested by me.
> Tats, could you please include it in your planned snapshot tarball?

Merged, thanks.

--
Tatsuya Kinoshita




Bug#1020215: w3m interprets <title> when not in <head> but in <svg>

2022-09-18 Thread Robert Alm Nilsson
Package: w3m
Version: 0.5.3+git20220429-1+b1

When opening a page in w3m that contains an <svg> element that contains
a <title> tag, w3m will use that svg title as the page title.

An example of a page to demonstrate this problem is github.com.  The
real title of github.com is "GitHub: Where the world builds software ·
GitHub" but w3m displays the title "Go" (or the name of any other
language) because the front page contains svg images for different
programming languages with title tags.
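The intended behaviour can be sketched with a toy scanner that accepts a title only while inside the head. This is an assumption-laden illustration and not w3m code: `head_title` is a hypothetical helper, and it assumes lowercase tags without attributes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of w3m: copy the text of the first
 * <title> that appears between <head> and </head> into out.
 * Returns 1 if such a title was found, 0 otherwise. */
int head_title(const char *html, char *out, size_t outsz)
{
    int in_head = 0;
    const char *p = html;

    if (outsz == 0)
        return 0;
    out[0] = '\0';

    while ((p = strchr(p, '<')) != NULL) {
        if (strncmp(p, "<head>", 6) == 0) {
            in_head = 1;
            p += 6;
        } else if (strncmp(p, "</head>", 7) == 0) {
            in_head = 0;
            p += 7;
        } else if (strncmp(p, "<title>", 7) == 0) {
            const char *start = p + 7;
            const char *end = strstr(start, "</title>");

            if (end == NULL)
                return 0;           /* unterminated title: give up */
            if (in_head) {
                size_t n = (size_t)(end - start);

                if (n >= outsz)
                    n = outsz - 1;  /* truncate to fit the buffer */
                memcpy(out, start, n);
                out[n] = '\0';
                return 1;
            }
            p = end + 8;            /* skip a title outside the head */
        } else {
            p++;                    /* some other tag: keep scanning */
        }
    }
    return 0;
}
```

For a page whose body contains `<svg><title>Go</title></svg>`, this sketch ignores the svg title and reports only a title found in the head.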

Here is a patch I made.  I don't know if this is the best way to
solve this, but it seems to work in practice and I will use it locally
until there is an upstream fix.

commit f41db326e73fde685c1d0b79e46beec56336995e
Author: Robert Alm Nilsson 
Date:   Sun Sep 18 09:51:29 2022 +0200

Only read title when in head

Before this change, it was possible that w3m would interpret a title tag
under e.g. an svg tag as the page title.

diff --git a/file.c b/file.c
index 9704cea..8e8b280 100644
--- a/file.c
+++ b/file.c
@@ -4816,19 +4816,23 @@ HTMLtagproc1(struct parsed_tag *tag, struct html_feed_environ *h_env)
/* obuf->flag |= RB_IGNORE_P; */
return 1;
 case HTML_TITLE:
-   close_anchor(h_env, obuf);
-   process_title(tag);
-   obuf->flag |= RB_TITLE;
-   obuf->end_tag = HTML_N_TITLE;
+   if (obuf->flag & RB_HEAD) {
+   close_anchor(h_env, obuf);
+   process_title(tag);
+   obuf->flag |= RB_TITLE;
+   obuf->end_tag = HTML_N_TITLE;
+   }
return 1;
 case HTML_N_TITLE:
-   if (!(obuf->flag & RB_TITLE))
-   return 1;
-   obuf->flag &= ~RB_TITLE;
-   obuf->end_tag = 0;
-   tmp = process_n_title(tag);
-   if (tmp)
-   HTMLlineproc1(tmp->ptr, h_env);
+   if (obuf->flag | RB_HEAD) {
+   if (!(obuf->flag & RB_TITLE))
+   return 1;
+   obuf->flag &= ~RB_TITLE;
+   obuf->end_tag = 0;
+   tmp = process_n_title(tag);
+   if (tmp)
+   HTMLlineproc1(tmp->ptr, h_env);
+   }
return 1;
 case HTML_TITLE_ALT:
if (parsedtag_get_value(tag, ATTR_TITLE, &p))
@@ -5523,9 +5527,13 @@ HTMLtagproc1(struct parsed_tag *tag, struct html_feed_environ *h_env)
}
}
 case HTML_N_HEAD:
+   obuf->flag &= ~RB_HEAD;
if (obuf->flag & RB_TITLE)
HTMLlineproc1("</title>", h_env);
+   return 1;
 case HTML_HEAD:
+   obuf->flag |= RB_HEAD;
+   return 1;
 case HTML_N_BODY:
return 1;
 default:
diff --git a/fm.h b/fm.h
index 25857f8..9e12b42 100644
--- a/fm.h
+++ b/fm.h
@@ -675,6 +675,7 @@ struct readbuffer {
 #define RB_DEL 0x10
 #define RB_S   0x20
 #define RB_HTML5   0x40
+#define RB_HEAD    0x80
 
 #define RB_GET_ALIGN(obuf) ((obuf)->flag_ALIGN)
 #define RB_SET_ALIGN(obuf,align) do{(obuf)->flag &= ~RB_ALIGN; (obuf)->flag |= (align); }while(0)
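The flag handling the patch relies on is the usual set, clear and test pattern for a single bit. A minimal sketch, assuming a stand-in struct: `struct demo_buf` and the three helpers are hypothetical, the `RB_TITLE` value is made up for the example, and only the `RB_HEAD` value is taken from the patch.

```c
#define RB_TITLE 0x01   /* value assumed for illustration only */
#define RB_HEAD  0x80   /* value taken from the patch above */

/* Hypothetical stand-in for struct readbuffer; only the flag word. */
struct demo_buf {
    int flag;
};

/* Set the bit when <head> is seen. */
void enter_head(struct demo_buf *b)
{
    b->flag |= RB_HEAD;
}

/* Clear the bit when </head> is seen; other bits are preserved. */
void leave_head(struct demo_buf *b)
{
    b->flag &= ~RB_HEAD;
}

/* Test the bit with &; flag | RB_HEAD would be nonzero for any flag. */
int in_head(const struct demo_buf *b)
{
    return (b->flag & RB_HEAD) != 0;
}
```

Setting or clearing `RB_HEAD` this way leaves unrelated bits such as `RB_TITLE` as they were, which is why one integer field can carry several independent states.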