[Mailman-Developers] Re: Hyperkitty's ability to build a thread

2021-02-13 Thread Stephen J. Turnbull
Mark Sapiro writes:

 > There is an article on threading at
 >  and an RFC
 > . These describe algorithms
 > which are fairly complex, but if someone wanted to try to implement them
 > in HyperKitty, we would certainly consider the implementation.

Ouch.  I didn't realize we didn't use Jamie's algorithm.  It's not
that hard to implement[1], and it's robust and extremely efficient[2],
modulo the cost of accessing message-id, in-reply-to, and references.
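
To make the shape of that algorithm concrete, here is a minimal sketch of
its core linking step in Python. It is a simplification under stated
assumptions, not a full implementation: it omits the pruning of empty
containers and the subject-based grouping pass, and the names Container
and build_threads are hypothetical, not HyperKitty code.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Container:
    """A node in the thread tree; has_message is False for placeholders."""
    msgid: str
    has_message: bool = False
    parent: Optional["Container"] = None
    children: list = field(default_factory=list)


def _is_ancestor(maybe_ancestor, node):
    # Walk up from node; True if maybe_ancestor is node or one of its ancestors.
    while node is not None:
        if node is maybe_ancestor:
            return True
        node = node.parent
    return False


def _link(parent, child):
    # Refuse any link that would create a loop in the tree.
    if _is_ancestor(child, parent):
        return
    if child.parent is not None:
        child.parent.children.remove(child)
    child.parent = parent
    parent.children.append(child)


def build_threads(messages):
    """messages: iterable of (msgid, references, in_reply_to) tuples, where
    references is a list of ids (oldest first) and in_reply_to is an id or
    None. Returns the list of root containers."""
    table = {}

    def get(msgid):
        return table.setdefault(msgid, Container(msgid))

    for msgid, references, in_reply_to in messages:
        node = get(msgid)
        node.has_message = True
        chain = list(references)
        if in_reply_to and in_reply_to not in chain:
            chain.append(in_reply_to)
        # Link consecutive references without overriding a known parent.
        for older, newer in zip(chain, chain[1:]):
            older_node, newer_node = get(older), get(newer)
            if newer_node.parent is None:
                _link(older_node, newer_node)
        # The last id in the chain is taken as this message's parent.
        if chain:
            _link(get(chain[-1]), node)

    return [c for c in table.values() if c.parent is None]
```

Messages may arrive in any order, and an id that is referenced but not
archived simply becomes a placeholder container, so replies to an
off-list message still end up grouped under a common root.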

A robust, tested, and documented implementation sounds like a GSoC
project to me.  And a PyPI package, though that would be somewhat
harder.

Footnotes: 
[1]  It took me about a day to get it mostly working in Elisp, and
most of the difficulty and the remaining issues were due to working
around bugs in the MUA that caused uncaught exceptions in the MUA.

[2]  It's multipass, but it's worst-case and average-case linear.
Worst-case is linear because the line-length restriction keeps the
length of references down to about 15 at most.
___
Mailman-Developers mailing list -- mailman-developers@python.org
To unsubscribe send an email to mailman-developers-le...@python.org
https://mail.python.org/mailman3/lists/mailman-developers.python.org/
Mailman FAQ: https://wiki.list.org/x/AgA3

Security Policy: https://wiki.list.org/x/QIA9


[Mailman-Developers] Re: Hyperkitty's ability to build a thread

2021-02-13 Thread Thomas Hochstein
Danil Smirnov schrieb:

> I wonder if Hyperkitty is able to leverage some other method to combine the
> thread correctly in this case?

There is no way to display a thread without threading information (in
In-Reply-To: or References: headers). One can try to match by Subject
and/or Date, but that is a heuristic bound to fail.
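
As a rough illustration of what "threading information" means in
practice, the following sketch pulls the message-ids out of those two
headers using Python's standard email module. The function name
threading_ids and the simplified message-id pattern are assumptions made
for this example.

```python
import re
from email import message_from_string

# Simplified pattern for a message-id in angle brackets (an assumption;
# the full RFC 5322 grammar is more permissive).
MSGID_RE = re.compile(r"<[^<>\s]+>")


def threading_ids(raw_message):
    """Return (in_reply_to_ids, reference_ids) found in a raw message."""
    msg = message_from_string(raw_message)
    in_reply_to = MSGID_RE.findall(msg.get("In-Reply-To", ""))
    references = MSGID_RE.findall(msg.get("References", ""))
    return in_reply_to, references
```

If both lists come back empty, the message carries nothing to thread on
and only heuristics remain.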

The "correct" way would be to fix the client that is erroneously [1]
missing or deleting threading headers.

-thh

[1] Violating a SHOULD in RFC 5322, 3.6.4.


[Mailman-Developers] Re: Hyperkitty's ability to build a thread

2021-02-13 Thread Mark Sapiro
On 2/13/21 4:50 AM, Danil Smirnov wrote:
> Hi everyone and Abhilash in particular :)
> 
> I've faced a case when HyperKitty is unable to chain messages into a thread:
> https://wlug.mailman3.com/hyperkitty/list/w...@lists.wlug.org/
> 
> (See messages with the subject "WLUG Meeting Feb 11th 2021! Topic: Good
> question!".)
> 
> It's quite a disappointment as GMail does show them correctly - as a
> single thread.
> 
> As per my small investigation, a subscriber Robert N. Evans seems to have
> "In-Reply-To" headers stripped from the messages, which probably causes the
> thread to break.


As Steve notes, threading by Subject: matching has its own issues and
HyperKitty makes no attempt to do that.

HyperKitty's deficiency is that it uses only In-Reply-To: and ignores
References:. This is an issue if someone sends a reply to an off-list
message back to the list. In that case, HyperKitty doesn't find the
In-Reply-To: message-id so starts a new thread, even though there may be
References: message-ids in the archive.
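
A minimal sketch of the missing fallback, assuming the archive can be
queried as a set of known message-ids. The function find_parent and its
parameters are hypothetical and are not HyperKitty's actual API.

```python
def find_parent(in_reply_to, references, archived_ids):
    """Return the message-id to thread under, or None to start a new thread.

    in_reply_to:  message-id string, or None if the header is absent
    references:   list of message-ids from References:, oldest first
    archived_ids: set of message-ids already present in the archive
    """
    if in_reply_to and in_reply_to in archived_ids:
        return in_reply_to
    # Fall back to the nearest referenced ancestor that is in the archive.
    for ref in reversed(references):
        if ref in archived_ids:
            return ref
    return None
```

With this, a reply to an off-list message attaches to the most recent
referenced message that the archive does know about instead of starting
a new thread.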


> I wonder if Hyperkitty is able to leverage some other method to combine the
> thread correctly in this case?


There is an article on threading at
 and an RFC
. These describe algorithms
which are fairly complex, but if someone wanted to try to implement them
in HyperKitty, we would certainly consider the implementation.

Note that even HyperKitty's simple method generally works well. It
breaks down when replies to off-list messages go back to the list, when
users' mail clients don't add In-Reply-To: (these are fairly rare), and
when a user composes what is actually a reply as a new message.

Also note that "combine the thread correctly" is a subjective opinion,
at least in some cases.

-- 
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


[Mailman-Developers] Hyperkitty's ability to build a thread

2021-02-13 Thread Stephen J. Turnbull
Danil Smirnov writes:

 > As per my small investigation, a subscriber Robert N. Evans seems to have
 > "In-Reply-To" headers stripped from the messages, which probably causes the
 > thread to break.
 > 
 > I wonder if Hyperkitty is able to leverage some other method to combine the
 > thread correctly in this case?

It's simply not possible to guarantee correct threading if neither
References nor In-Reply-To is present.

It is possible to place a message in an approximately appropriate
place by threading the threadable messages, grouping that message with
threads with "the same" subject, and inserting it (and any
descendants) after some message with an earlier date, but this is
inherently ambiguous as that message could be a reply to *any* such
message with an earlier date.

This would work well if there is a single linear thread.  But it is
unlikely to work at all well if several posters replied to a single
message in the recent past so that there are multiple subthreads
active at a given time.
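
The ambiguity is easy to see in a small sketch. It assumes the messages
have already been grouped by "the same" subject and are given as
(msgid, date) pairs; the function names are hypothetical.

```python
from datetime import datetime


def candidate_parents(orphan_date, same_subject_messages):
    """All messages the orphan could be a reply to: every earlier one."""
    return [msgid for msgid, date in same_subject_messages
            if date < orphan_date]


def guess_parent(orphan_date, same_subject_messages):
    """Pick the most recent earlier message. This is only a guess."""
    earlier = [(date, msgid) for msgid, date in same_subject_messages
               if date < orphan_date]
    return max(earlier)[1] if earlier else None


# Two people replied to the same root; a header-less message follows.
thread = [
    ("root", datetime(2021, 2, 11, 9, 0)),
    ("reply1", datetime(2021, 2, 11, 10, 0)),
    ("reply2", datetime(2021, 2, 11, 10, 5)),
]
orphan_date = datetime(2021, 2, 11, 10, 10)
print(candidate_parents(orphan_date, thread))
print(guess_parent(orphan_date, thread))
```

In a single linear thread the most recent earlier message is usually the
right parent. Here all three are plausible, and the guess picks reply2
even if the orphan was really answering reply1 or the root.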

Gmail has a big advantage, since they're reading your mail, indexing
it, and creating a fine-grained statistical profile.  That database
can probably be leveraged for better threading.  Or if your posters
consistently top-post, it's probably not too hard to match quoted
content against the top-level content of an earlier post -- if you
have both the development and the computational resources of Google.
(Come to think of it, for Gmail this would probably allow them to
compress their storage by 50%.)  Or maybe they just got lucky.

Steve


[Mailman-Developers] Re: Hyperkitty's ability to build a thread

2021-02-13 Thread Sam Kuper
On Sat, Feb 13, 2021 at 02:50:47PM +0200, Danil Smirnov wrote:
> Hi everyone and Abhilash in particular :)
> 
> I've faced a case when HyperKitty is unable to chain messages into a
> thread: https://wlug.mailman3.com/hyperkitty/list/w...@lists.wlug.org/
> 
> [..] It's quite a disappointment as GMail does show them correctly -
> as a single thread.
> 
> As per my small investigation, a subscriber Robert N. Evans seems to
> have "In-Reply-To" headers stripped from the messages, which probably
> causes the thread to break.
> 
> I wonder if Hyperkitty is able to leverage some other method to
> combine the thread correctly in this case?

There are two common methods to group messages into threads:

- Using the "In-Reply-To:" header.  (The "correct" approach.  Downside:
  gives false negatives if users strip those headers, as you've seen.)

- Using the "Subject:" header.  (A heuristic approach.  Downside: gives
  false positives if users start a new thread with the same subject as
  an older thread.)
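
A sketch of the second, heuristic method, assuming that "same subject"
means equal after stripping reply/forward prefixes and list tags and
ignoring case. The prefix list and function names are illustrative
assumptions, not taken from Gmail or HyperKitty.

```python
import re
from collections import defaultdict

# Leading "Re:", "Fwd:", "AW:" markers and "[list-tag]" prefixes, repeated.
_PREFIX_RE = re.compile(
    r"^\s*((re|fwd?|aw)\s*:\s*|\[[^\]]*\]\s*)+", re.IGNORECASE
)


def normalize_subject(subject):
    """Strip reply/list prefixes, collapse whitespace, and lowercase."""
    stripped = _PREFIX_RE.sub("", subject)
    return " ".join(stripped.split()).lower()


def group_by_subject(messages):
    """messages: iterable of (msgid, subject). Returns {subject: [msgid]}."""
    groups = defaultdict(list)
    for msgid, subject in messages:
        groups[normalize_subject(subject)].append(msgid)
    return dict(groups)
```

Any unrelated message that happens to reuse an old subject lands in the
same group, which is the false positive described above.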

I believe Gmail uses the "Subject:" header.  That would explain why
Gmail was able to recognise Robert N. Evans's messages as part of the
thread even though they lacked the "In-Reply-To:" header.

I don't know if Hyperkitty allows threading using "Subject:" matching,
but if so then that would probably solve your problem.

Sam

-- 
A: When it messes up the order in which people normally read text.
Q: When is top-posting a bad thing?

()  ASCII ribbon campaign. Please avoid HTML emails & proprietary
/\  file formats. (Why? See e.g. https://v.gd/jrmGbS ). Thank you.


[Mailman-Developers] Hyperkitty's ability to build a thread

2021-02-13 Thread Danil Smirnov
Hi everyone and Abhilash in particular :)

I've faced a case when HyperKitty is unable to chain messages into a thread:
https://wlug.mailman3.com/hyperkitty/list/w...@lists.wlug.org/

(See messages with the subject "WLUG Meeting Feb 11th 2021! Topic: Good
question!".)

It's quite a disappointment as GMail does show them correctly - as a
single thread.

As per my small investigation, a subscriber Robert N. Evans seems to have
"In-Reply-To" headers stripped from the messages, which probably causes the
thread to break.

I wonder if Hyperkitty is able to leverage some other method to combine the
thread correctly in this case?

"Good" and "bad" message examples are in the attachment.

Best regards,
Danil Smirnov
--- Begin Message ---
John Stoffel via WLUG  writes:

> I recall, it's mostly the memory ordering around byte accesses that
> are the problem.  

Are you saying that the Big Endian vs Little Endian flame war
actually had consequences in the Real World?

Or something even more obscure?

   -- Keith
___
WLUG mailing list -- w...@lists.wlug.org
To unsubscribe send an email to wlug-le...@lists.wlug.org
Create Account: https://wlug.mailman3.com/accounts/signup/
Change Settings: https://wlug.mailman3.com/postorius/lists/wlug.lists.wlug.org/
Web Forum/Archive: 
https://wlug.mailman3.com/hyperkitty/list/w...@lists.wlug.org/message/3LQYE4JHDJ7YLZX6EST7S4D4Z3KXGE3T/
--- End Message ---
--- Begin Message ---
That is not what I see when I query one of the major name servers.  I would 
guess your server is configured differently...

rne@P5:~$ dig @1.1.1.1 isc.org

; <<>> DiG 9.16.1-Ubuntu <<>> @1.1.1.1 isc.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31866
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;isc.org.   IN  A

;; ANSWER SECTION:
isc.org.        9   IN  A   149.20.1.66

;; Query time: 24 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Thu Feb 11 10:03:30 EST 2021
;; MSG SIZE  rcvd: 52


-BE

-Original Message-
>From: Keith Wright via WLUG 
>Sent: Feb 11, 2021 1:04 AM
>To: Worcester Linux Users' Group General Discussion 
>Cc: w...@lists.wlug.org, andre.lehov...@gmx.com, Keith Wright 
>
>Subject: [WLUG] Re: WLUG Meeting Feb 11th 2021! Topic: Good question!
>
>Andre Lehovich via WLUG  writes:
>
>>> dig @66.92.74.188 isc.org
>> 
>> Here you go, hope it's useful...
>
>Thank you.  That's a lot of information.
>
>> quetzal:~ al$ dig @66.92.74.188 isc.org
>> 
>> ; <<>> DiG 9.10.6 <<>> @66.92.74.188 isc.org
>> ; (1 server found)
>> ;; global options: +cmd
>> ;; Got answer:
>> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11995
>> ;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 27
>> ;; WARNING: recursion requested but not available
>  ^ ^ ^^^ ^^^ ^   
>That looks good.
>I don't want to be doing recursion for you (nothing personal).
>
>But where did all the rest of that come from?
>I've never seen anything like that!
>Did my server send all that?  Why??
>
>> ;; OPT PSEUDOSECTION:
>> ; EDNS: version: 0, flags:; udp: 4096
>> ;; QUESTION SECTION:
>> ;isc.org.                IN      A
>> 
>> ;; AUTHORITY SECTION:
>> .                       348191  IN      NS      c.root-servers.net.
>> .                       348191  IN      NS      d.root-servers.net.
>> .                       348191  IN      NS      e.root-servers.net.
>> .                       348191  IN      NS      f.root-servers.net.
>> .                       348191  IN      NS      g.root-servers.net.
>> .                       348191  IN      NS      h.root-servers.net.
>> .                       348191  IN      NS      i.root-servers.net.
>> .                       348191  IN      NS      j.root-servers.net.
>> .                       348191  IN      NS      k.root-servers.net.
>> .                       348191  IN      NS      l.root-servers.net.
>> .                       348191  IN      NS      m.root-servers.net.
>> .                       348191  IN      NS      a.root-servers.net.
>> .                       348191  IN      NS      b.root-servers.net.
>> 
>> ;; ADDITIONAL SECTION:
>> a.root-servers.net.     348191  IN      A       198.41.0.4
>> a.root-servers.net.     348191  IN      AAAA    2001:503:ba3e::2:30
>> b.root-servers.net.     348191  IN      A       199.9.14.201
>> b.root-servers.net.     348191  IN      AAAA    2001:500:200::b
>> c.root-servers.net.     348191  IN      A       192.33.4.12
>> c.root-servers.net.     348191  IN      AAAA    2001:500:2::c
>> d.root-servers.net.     348191  IN      A       199.7.91.13
>> d.root-servers.net.     348191  IN      AAAA    2001:500:2d::d
>> e.root-servers.net.     348191  IN      A       192.203.230.10
>> e.root-servers.net.     348191  IN      AAAA    2001:500:a8::e
>> f.root-servers.net.     348191  IN      A       192.5.5.241
>> f.root-servers.net.     348191  IN      AAAA    2001:500:2f::f
>> g.root-servers.net.     348191  IN      A       192.112.36.4
>>