[EMAIL PROTECTED] posted [EMAIL PROTECTED], excerpted below, on Tue, 09 Dec 2008 13:41:28 +0100:
>> But what you're seeing is normal. Keep in mind that if pan is saying a
>> million articles, that's after combining multiparts. In some groups,
>> that could mean ten or fifty million actual single-part articles.
>
> I was referring to the total article count (not the thread count) My
> largest groups file is 900 MB.

But how are you counting articles? Are you using the pan unread count
or something else? If you're using pan unread count, it's counting
multiple parts (not threads, parts of a single multipart message, aka
multi-segment) as a single entry. If you look you can see it, 15/15 or
whatever, or sometimes a missing part, 14/15, with the corresponding
broken puzzle icon instead of the full puzzle icon. With old-pan you
could separate the parts into the individual pieces, which then showed
up as "threads". With new-pan, it's displayed only once, tho actual
replies are still shown threaded.

> Perhaps using memory maps might speed up things ? Also the data seems to
> be writting in ASCII format, requiring rescan/repars every time.
> Perhaps saving in binary, which allows even more efficient use of memory
> maps might be usefull (Option only for large groups perhaps ? ) It might
> not reduce the size of the file but it will avoid having to convert lots
> of integers (like line numbers, sizes, dates etc). Also it would allow
> to read in blocks without having to process those blocks.

Perhaps as an option. Note that binary is a much more opaque format,
much harder to repair manually if necessary.

> that is true but when you know you might need to treat 1G of data, you
> start managing the data cleaverly. Generally you try to save the work
> that you did for later purposes. E.g. if you have already figured out
> certain things, you store that info so that you don't have to figure it
> out later on.

As I said, pan now does save its work. Old-pan used to re-thread every
time you loaded the group.

>> Meanwhile, how do you monitor CPU usage?
>> Are you monitoring it per core, or overall only? Most of new-pan is
>> single-threaded, because Charles had gone with multi-threaded in
>> old-pan and found the complexity and thread-race bugs just not worth
>> it for the limited increase in performance. Instead, new-pan now
>> hatches threads only in limited performance critical sections (like
>> when starting multiple connections at once, one place I know it's used
>> as I remember Charles fixing a bug I had with it). So pan will likely
>> be using near 100% of a single core, but the others should remain
>> mostly idle, I /think/. (It has been awhile since I did binaries and
>> IDR for sure.)
>
> No when it is busy doing stuff and blocking other apps from doing
> something I ran top and it showed pan using about 80% cpu, constantly
> for a certain time.

Yes, but what are your top settings? Are you showing each individual
core separately or only the combined figure, and are you using irix or
solaris mode? I'm asking because, depending on the setting, a process
using all four cores at 100% each could be reported as 400% or as 100%,
with 100% of a single core correspondingly showing as 100% or 25%.

You're using Kubuntu so you should be able to set up a ksysguard graph
if you like. I don't know if you're on KDE 3.5 or 4.x but 4.x is still
broken for daily use for me (4.2 should fix most of it AFAIK), so I'm
using 3.5.10 still, with a ksysguard kicker applet at the top of my
screen. Its first four graphs are user/system/nice CPU on each of the
four cores, so when I'm in KDE 3 anyway, I get a live updating graph of
activity on each of the four cores. (FWIW, next is load, then memory,
then swap which is normally zero, then up and down network traffic,
then multiplexed disk activity, then the four CPU core-temps, then two
additional system temps.
I'm running two 1920x1200 LCDs stacked for 1920x2400, with the
ksysguard applet taking up nearly 1500 px width at the max 300 px
kicker panel height, on the top LCD.)

>> Also, it may be disk I/O related, if you have a single disk only and
>> that group's data isn't in cache yet. I run a dual dual-core Opteron
>> 290 (2.8 GHz) here, so have four cores too, but I'm running
>> Gentoo/~amd64 with everything compiled to my specific hardware, which
>> will help some (BTW, you didn't mention whether you were running 32-bit
>> or 64-bit kubuntu, 4 gigs on 32-bit is going to be less efficient than
>> 4 gigs on 64-bit), and I run a 4-disk kernel/md RAID, with pan's data
>> on RAID-6, which means it's two-way striped. RAID striping really
>> /does/ help, and not just with pan; you might be surprised how much.
>
> Yes i have been considering switching since
>
> 1. my 4 GB is not used (because of memory of graphics card)

FWIW you should be able to configure the 32-bit kernel for 64-gig mode
if you like, or probably download one so configured from Ubuntu
(possibly named 686-bigmem, the Debian name AFAIK). If the BIOS will
remap the memory, you should then get the memory ordinarily covered by
the legacy 32-bit PCI I/O hole (typically half a gig or so, sometimes a
full gig) mapped above the 4-gig boundary. This works using PAE mode,
AFAIK. Here's a bit about it. The title says a gig, but it talks about
both the HIGHMEM-4G and HIGHMEM-64G options.

http://www.linux.com/feature/119287

But in 32-bit mode that's less efficient as it has to effectively page
the memory into a window it can actually address. 64-bit of course
eliminates that. And... not all BIOS support it, 32-bit or 64-bit,
unfortunately, altho most of the newer ones will in 64-bit at least.

> 2. indeed my disk seems to be the bottleneck.
> However I need to completely upgrade my box and that is a hard job.
> Also I have no experience setting up RAID (donno even if my mobo
> supports it)

FWIW, I had no experience with it either, until I had two drives go out
in two years and decided I needed a bit more reliability than that. So
I upgraded to 4xSATA drives (my mobo supported it in firmware RAID mode
but that sucks in Linux since it's really software RAID anyway, and
proper kernel RAID is more reliable, so I set it to straight SATA mode
and used the kernel RAID) and after some research and planning it all
out, set it up.

If you decide to do it and know nothing of RAID, you'll want to google
for the free chapter of O'Reilly's Linux RAID book. It's an excellent
intro, explaining the difference between hardware, firmware and
kernel/md RAID, and the various RAID levels. That's where I started as
I knew very little about it before that.

After that, you'll want to read the kernel's md.txt document (in your
kernel Documentation subdir) and look at the HOWTOs. In particular,
keep in mind that if you're going to boot off the RAID, you need a
small RAID-1/mirrored partition to hold /boot, since RAID-1 is all
either LILO or GRUB understand. When I set up, there wasn't a lot of
info out there about mdp/partitioned-RAID yet, as it was still pretty
new, but I managed to find what I needed.

You will also want to consider LVM2 on top of RAID, the way people
handled it before partitioned RAID. But while you can boot directly to
RAID using an appropriate kernel command line, unfortunately a root
filesystem on LVM2 requires an initramfs/initrd. I chose not to use
that, so I put my root filesystem and a backup on partitioned RAID, and
almost everything else I wanted to keep redundant on an LVM on top of
RAID, set up so I could load LVM after I had my rootfs on the
partitioned RAID already going.

That's the 10 km high overview at a few hundred km/hr! =:^) It sounds
confusing condensed like that, but take it a step at a time as I did,
and you should be fine, as I was/am.
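
To put rough numbers on the RAID levels mentioned above, here is a
minimal Python sketch. It assumes N identical disks and the textbook
definition of each level, and ignores md metadata overhead. The function
name raid_layout and the 500 GB disk size are made up for illustration
and are not taken from my setup.

```python
def raid_layout(level, disks, disk_gb):
    """Return (usable_gb, data_disks, failures_tolerated).

    Textbook figures for `disks` identical drives of `disk_gb` each.
    Hypothetical helper for illustration only.
    """
    minimum = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum:
        raise ValueError("only RAID levels 0, 1, 5 and 6 are handled here")
    if disks < minimum[level]:
        raise ValueError(f"RAID-{level} needs at least {minimum[level]} disks")

    if level == 0:
        # Pure striping: all disks hold data, no redundancy.
        data, tolerated = disks, 0
    elif level == 1:
        # Mirroring: every disk holds the same data.
        data, tolerated = 1, disks - 1
    elif level == 5:
        # Striping with one disk's worth of parity.
        data, tolerated = disks - 1, 1
    else:
        # RAID-6: striping with two disks' worth of parity.
        data, tolerated = disks - 2, 2

    return data * disk_gb, data, tolerated


if __name__ == "__main__":
    for level in (0, 1, 5, 6):
        usable, data, tolerated = raid_layout(level, disks=4, disk_gb=500)
        print(f"RAID-{level}: {usable} GB usable, "
              f"{data}-way data stripe, survives {tolerated} failed disk(s)")
```

With four disks, RAID-6 leaves two disks' worth of data per stripe,
which is the "two-way striped" I mentioned, and it survives any two
drives failing. A small RAID-1 for /boot gives up capacity but every
member holds a complete copy, which is why the boot loaders can read it.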
=:^) If you get stuck, you know someone to mail for help. =:^)

Not to pressure you if you don't believe you're ready, but really, if
you're already running quad-core and 4 gig RAM, a single spindle hard
drive IS the bottleneck, and you'll find the system not only faster,
but much more responsive, once you effectively get that millstone off
your neck. I just think it's such a shame to have a nice system bogged
down like that.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

_______________________________________________
Pan-users mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/pan-users
