memilanuk posted on Sat, 04 May 2013 19:00:23 +0000 as excerpted:

> Most of the time this seems to work... but in groups/lists with lower
> traffic I started noticing that the number of new articles shown in
> parentheses next to the group name doesn't always match the sum of the
> number of unread posts shown in parentheses next to each collapsed
> thread.
> Since I have no scoring rules in effect at this point, I don't think
> that there are new posts not being shown because of that.
>
> How can I get Pan to show new posts in both threads and sub-threads at
> the same time, without having to go and manually toggle the settings
> back and forth?

Thanks for specifying that you don't have any scoring in effect. That
was my first thought.

There are three additional factors that apply. I'd guess one or more
is/are what you are seeing here.

First, it's worth noting that in collapsed threads, the unread count is
the count of NOT SHOWN unread posts. If the visible post is ITSELF
unread, it won't be included in that number. Thus, for example, if an
entire thread with 10 posts is unread, the number in parentheses beside
the initial post will be (9), because the initial post itself is shown
-- there are 9 /hidden/ posts marked unread that will be shown if the
thread is expanded.

That throws me off occasionally as well, because I intuitively expect it
to be the total unread count for the (sub)thread, including the
displayed header if that message too is unread.

Second, note that for multi-part posts, generally of large binary
attachments, pan transparently combines them into one for display. So a
message split into 100 individual parts, which must be combined to get
the attachment, will display as a single post.[1] Before pan actually
fetches the group overview file, all it sees is that the highest post
sequence number for that group is N higher than it was the previous
time, and it simply does the math to deduce how many posts there are
between the two. But once pan fetches the overview file, it can do this
combining, which will reduce the number of unread posts, sometimes
significantly, particularly for binary groups.

But that mostly applies to binary groups, where multi-part posts are
common. In text groups these don't appear so often, so this is unlikely
to explain discrepancies there.
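
To make those first two points concrete, here's a minimal Python sketch.
It is NOT pan's code: the function names, the list of unread flags, and
the (article_number, post_id) overview pairs are all made up for
illustration, and how pan actually decides which parts belong to the
same post is simply assumed away behind post_id.

```python
# Illustrative sketch only; every name here is hypothetical, not pan's API.

def hidden_unread(unread_flags):
    """Number shown beside a collapsed thread.

    unread_flags[0] is the visible (top) post; the remaining entries are
    the posts hidden until the thread is expanded. Only hidden unread
    posts are counted, so the visible post never contributes.
    """
    return sum(1 for flag in unread_flags[1:] if flag)


def estimated_unread(old_high, new_high):
    """Estimate made before the overview is fetched.

    All that is known is how far the highest sequence number advanced
    since the previous visit.
    """
    return max(new_high - old_high, 0)


def displayed_unread(overview):
    """Count after the overview is fetched.

    overview is a list of (article_number, post_id) pairs. post_id is an
    assumed key naming the logical post a part belongs to, so all parts
    of one multi-part post collapse into a single displayed post.
    """
    return len({post_id for _number, post_id in overview})


# A fully unread 10-post thread, collapsed: the top post is not counted.
ten_unread = [True] * 10

# One binary post split into 100 parts, numbered 43001..43100.
parts = [(43000 + i, "big-binary-post") for i in range(1, 101)]
```

With these definitions, hidden_unread(ten_unread) gives 9 rather than
10, estimated_unread(43000, 43100) gives 100, and displayed_unread(parts)
gives 1, which is the sort of drop you'd see next to a binary group once
the overview arrives.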

Which brings me to number three, the most common reason (once #1 is
accounted for) for such discrepancies in low-traffic text groups, where
they're most likely to be noticed.

As mentioned in passing above, put in simple terms the way news works is
that when a client first asks about a newsgroup, it gets a reply that
gives some information about the group and the articles the server has
for that group. This information includes the post sequence numbers for
the earliest available post, the low water mark, and the newest
available post, the high water mark.

For subscribed groups, a news client typically remembers the high water
mark from the last time it connected, and can then do the math to deduce
the POSSIBLE number of unread articles available. (Of course, if it's
the first time you visited the group, or if you haven't visited in a
while and everything from before is already expired, so the low water
mark is higher than the remembered high water mark, the available posts
are only those between the earliest and the latest that the server has.)

*BUT*, for various reasons, while the post sequence numbers are indeed
in sequence, ever increasing, a server DOES NOT NECESSARILY have ALL the
posts in a particular range available. A user may have canceled or
superseded a post, for instance, or the server may apply spam filters
after assigning sequence numbers, deleting articles from the middle of
the sequence.

It's actually quite common for larger news service providers to have
dedicated incoming post machines that assign the sequence numbers,
before forwarding the posts to filter machines that filter out the spam,
thus creating holes in the sequence. The filter machines then forward
what's left on to the front-ends that the users and their news clients
actually connect to.

In such a setup, it's also quite possible for the posts to appear at the
front-ends in out-of-sequence order, such that say # 43561 and 43562
appear before # 43553-43560.
Then # 43557 and 43559 are filtered as spam, and # 43554 is canceled by
the original poster, leaving posts numbered 43553, 43555-43556, 43558
and 43560, appearing after # 43562, which all the while is still the
high-water-mark. And # 43554 might appear, then disappear when the
cancel gets processed.

When pan first checks that group, it'll see # 43562 as the
high-water-mark and do the math, displaying the resulting unread count.

Then when it fetches the actual overview file, pan learns what messages
are actually available. The number of unread messages can then drop, or
possibly go up if there are more backfills than new message number gaps.

Then later, when you actually try to fetch those messages, messages in
the overview may now be unavailable. Old ones may have expired, or maybe
someone canceled a message, or perhaps a scanner detected copyrighted
content and issued a takedown, or further spam-filtering was applied.

Thus in general the unread count can only be a rough estimate, found
initially by simply doing the math between new highwater count and
either old highwater count or new lowwater count, as appropriate.

(Actually, the server passes its estimate of the number of messages
between its lowwater and highwater counts as well, but that's an
estimate as well. The standard requires that there be no MORE messages
than that available, but it specifically allows there to be fewer, and
many servers simply do the same math to arrive at their estimate, even
tho in theory they have enough information available to pass a more
accurate count if they wanted to.)

---
[1] The terminology here isn't fully standardized and can be confusing.
An extremely large binary file like an ISO image is often pre-split into
multiple files before posting, with each of these pre-split parts posted
separately.
The individually posted parts are then often automatically split by the
posting software, with each individual message post only containing an
incomplete attachment that must be combined with the others in order to
retrieve the pre-split file part. It's this automatic splitting that pan
detects and displays as a single post, not the pre-split parts.

Of course complicating things is the fact that multi-part can also refer
to the reverse, a single article containing multiple parts, say a
plain-text message, the HTML form of the same message, and various
attachments for the images linked into the HTML message, all appearing
as part of the same single posted message.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

_______________________________________________
Pan-users mailing list
Pan-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/pan-users