On Wed, Nov 1, 2017 at 1:57 AM, Rob Landley <[email protected]> wrote:
> On 10/19/2017 06:13 PM, enh wrote
>>> On 09/20/2017 05:08 PM, enh wrote:
>>>> ps -T doesn't really work if you have any filters. so ps -AT is fine,
>>>> but ps -p <chrome pid> -T only shows the main thread.
>>>
>>> Alas, I don't personally use threads much so basically never test this.
>>>
>>>> why? because slots[SLOT_pid] is "wrong" in shared_match_process (where
>>>> by wrong i mean "is the tid").
>>>>
>>>> why? because toybox reads (say) /proc/147047/task/147058/stat and sees
>>>>
>>>> 147058 (CompositorTileW) S 31782 6249 6249 0 -1 1077952576 4 0 0 0 0 0
>>>> 0 0 20 0 11 0 1211910244 928649216 35602 18446744073709551615
>>>> 94558515900416 94558627572512 140720560858928 140506510343072
>>>> 140506833892356 0 0 4098 1073827581 1 0 0 -1 31 0 0 0 0 0
>>>> 94558627579744 94558633602584 94558666661888 140720560866826
>>>> 140720560866928 140720560866928 140720560869342 0
>>>>
>>>> and copies 147058 into SLOT_pid because that code no longer knows the real
>>>> pid.
>
> I added -H to iotop, fixed the off-by-one error in screen width
> truncation, set the screen width to 72, and ran "top -H -O TID,SHR" and
> cursored over to the SHR column and:
>
>   PID USER       TID [SHR]%CPU %MEM     TIME+ THREAD          PROCE
>  1865 landley   1865  84M 98.1 14.8 128:47.94 thunderbird     thun+
>  1865 landley   1895  84M  1.8 14.8  41:53.81 SoftwareVsyncTh thun+
>  1865 landley    723  84M  0.0 14.8   0:00.00 StreamT~s #3048 thun+
>  1865 landley  30482  84M  0.0 14.8   2:14.82 DOM Worker      thun+
>  1865 landley   9488  84M  0.0 14.8   7:16.20 DOM Worker      thun+
>  1865 landley  16082  84M  0.0 14.8  15:33.85 DOM Worker      thun+
>  1865 landley  15523  84M  0.0 14.8   0:00.49 DOM Worker      thun+
>  1865 landley  12086  84M  0.0 14.8  17:24.69 DOM Worker      thun+
>  1865 landley   3838  84M  0.0 14.8  18:14.68 DOM Worker      thun+
>  1865 landley   2254  84M  0.0 14.8  18:26.28 DOM Worker      thun+
>  1865 landley  30746  84M  0.0 14.8  18:30.12 DOM Worker      thun+
>
> And they all seem to think they're PID 1865, but each knows its TID?

yep, that works fine.

> $ ls /proc/1865/task/
> 10228 1865 1875 1880 1887 1893 1899 1911 1934 30482 5471
> 12086 1871 1876 1881 1888 1894 1903 1912 19926 30746 6316
> 14369 1872 1877 1882 1889 1895 1904 1921 22017 32148 9488
> 15523 1873 1878 1885 1891 1897 1906 1927 2254 3838 9955
> 16082 1874 1879 1886 1892 1898 1908 1933 2770 5469
>
> Which seems reasonable? It's got PID, it's got TID, what do I need to
> fix here?

the bug i reported... :-) i think you read the bug backwards because i
included a case that *does* work...

> (Aside: thunderbird really, really, really doesn't like a local
> linux-kernel folder with 500k messages in it. Or a pop3 inbox going back
> to 2013, which is the last time I split it. But I think what it's mad
> about right now is the BUG() I hit in the vfat code which I had to fsck
> away to do some work, and thus the vfat maintainer couldn't reproduce
> it. I've meant to reboot ever since, it happened about 5 times before I
> fixed it, emergency-zapping a filesystem each time, and the memory
> management on this box has gone all wonky since then. I've been meaning
> to reboot to replace the keyboard anyway, but 8 desktops full of windows
> full of tabs takes a while to unwind...)
>
>>>> not sure how best to fix this.
>>>
>>> Hmmm... Reasonably straightforward to fix,
>
> Not necessarily straightforward to reproduce.
>
> $ ps -AT
> PID TID
> 32667 32667 ? 00:26:22 chromium-browse
> 32667 32668 ? 00:00:00 TaskSchedulerSe
> 32667 32669 ? 00:00:20 Chrome_ChildIOT
> 32667 32670 ? 00:00:00 GpuMemoryThread

...specifically the ps -AT case...

...and a case that *doesn't* work, which is when there's a filter
*instead of* -A. i'll just repeat the original because i'm too lazy to
rewrite it and because i think the problem is just that you were just
too tired when you looked at it :-)

ps -T doesn't really work if you have any filters. so ps -AT is fine,
but ps -p <chrome pid> -T only shows the main thread.

why?
because slots[SLOT_pid] is "wrong" in shared_match_process (where by
wrong i mean "is the tid").

why? because toybox reads (say) /proc/147047/task/147058/stat and sees

147058 (CompositorTileW) S 31782 6249 6249 0 -1 1077952576 4 0 0 0 0 0
0 0 20 0 11 0 1211910244 928649216 35602 18446744073709551615
94558515900416 94558627572512 140720560858928 140506510343072
140506833892356 0 0 4098 1073827581 1 0 0 -1 31 0 0 0 0 0
94558627579744 94558633602584 94558666661888 140720560866826
140720560866928 140720560866928 140720560869342 0

and copies 147058 into SLOT_pid because that code no longer knows the
real pid.

i'll also add a concrete example from my laptop right now:

/tmp/toybox$ ps -T 32190
  PID  SPID TTY      STAT   TIME COMMAND
32190 32190 ?        Sl     0:00 /usr/bin/uplink-soecks
32190 32192 ?        Sl     0:01 /usr/bin/uplink-soecks
32190 32193 ?        Sl     0:00 /usr/bin/uplink-soecks
32190 32194 ?        Sl     0:02 /usr/bin/uplink-soecks
32190 32197 ?        Sl     0:02 /usr/bin/uplink-soecks
32190 32204 ?        Sl     0:02 /usr/bin/uplink-soecks
32190 32205 ?        Sl     0:00 /usr/bin/uplink-soecks
32190 32206 ?        Sl     0:02 /usr/bin/uplink-soecks
32190 32207 ?        Sl     0:00 /usr/bin/uplink-soecks
32190 21914 ?        Sl     0:02 /usr/bin/uplink-soecks
32190 21915 ?        Sl     0:01 /usr/bin/uplink-soecks
32190 31176 ?        Sl     0:01 /usr/bin/uplink-soecks
/tmp/toybox$ ./toybox ps -T 32190
  PID   TID TTY          TIME CMD
32190 32190 ?        00:00:16 uplink-soecks
/tmp/toybox$

> They're different?
>
> $ ls /proc/32667/task/
> 10087 32668 32670 32672 32674 5747 5754
> 32667 32669 32671 32673 32697 5748
>
> And they're reasonable?
>
>>> but my tree has local c
>>> changes in ps.c. Looks like I'm adding -m to show maximum number of
>>> lines (somebody asked, it's easy enough.)
>>
>> i said "no" to the single internal request we had for that when we
>> switched from traditional Android top to toybox top. easy, yes, but
>> not obviously useful. the original Android top only had batch mode, so
>> it was a bit more useful then. but "first N" isn't an obviously
>> meaningful heuristic.
>> "field X no lower than Y" would be more
>> convincing. but that's no longer as easy :-)
>
> I can yank it again. If you wanna design a filter syntax I can probably
> implement something.

no, i don't care either way. i just wasn't going to implement something
that i personally believe isn't useful (or at least "doesn't answer the
question they actually have, and is prone to giving you too much or too
little information"). its two-year absence has weaned Google's Android
folks off it, so i'll probably never be annoyed by a useless bug report
caused by truncation again :-)

> I've just started poking at bc ala shell $((blah)) math syntax, and I'm
> likely to make a function or two that lets you substitute in variables
> (via string substitution) and do math on the result.
>
> That said... that's not the syntax we've got in find, or in test. And ps
> has:
>
> -o FIELDs instead of defaults, each with optional :size and =title
>
> I could add <XXX and >XXX to that, I suppose? Might take a bit of
> fiddling to make room.
>
> That said, you still couldn't implement -m with a syntax like that.
> Maximum number of fields to display isn't a -o field. :)

yeah, but like i said: it's just not a useful thing anyway. if you
genuinely just care about lines of text, use `head`. if you're using it
as a proxy for something else, this feature would let you actually cut
off at the right point. (so personally i'd have said "no -m, but
generalized filtering is on the TODO list, waiting for enough people to
have concrete use cases".)

>>> And -H to iotop (which is
>>> where I left off; need to come up with a test for this and haven't got
>>> one. Is chrome threads or processes? The big scott mccloud comic implied
>>> processes, but the way google does everything implies threads, but
>>> threads would defeat the entire VM sandboxing purpose of having each tab
>>> in its own process...)
>>
>> if you run (GNU) "ps -AT" while chrome's running, you'll see it's a
>> mix.
>> i think even the original design assumed that. see the first
>> diagram here, where solid boxes are processes and dashed boxes are
>> threads:
>> https://www.chromium.org/developers/design-documents/multi-process-architecture
>
> I read Scott McCloud's comic way back when, and am currently on a plane
> to tokyo and would have to pay for network access (it's not the money,
> it's the visceral dislike of entering credit card information into a web
> page; DO NOT TRUST), but I'll try to remember to take a look.
>
> Meanwhile, I believe you. :)
>
>> here's far more detail than you could possibly want (including the
>> command-line options that let you configure the model):
>> https://www.chromium.org/developers/design-documents/process-models
>
> I'm happy that as of a couple months back:
>
> ps ax | grep renderer | awk '{print $1}' | xargs kill

i actually slightly miss that it no longer leaks itself to death every
couple of weeks --- now i'm sometimes forced to manually restart it when
the "time to upgrade" blob has been red too long.

> works again. I haven't asked much deeper than that. (I remember the
> videos of high speed lightning discharge vs chrome page rendering. My
> experience on this netbook is more "under 5 seconds is pretty good,
> under 15 is usually tolerable", but the stop button on chrome is UTTERLY
> USELESS (doesn't even stop the notification crawl at the bottom from
> TELLING you all the things it's loading, often using the monthly 4 gig
> data cap t-mobile applies to tethering but not to what the actual phone
> uses because money; I'm checking to see I've got the right youtube link
> before tweeting it, I _DON'T_ want you to spool 10 megabytes of video
> data through a metered connection).
>
> The workaround is to right click disable wifi on the networkmangler icon
> until it stops trying, usually about 15 seconds.
> One time out of a
> thousand it gets Confused and your network goes away and CANNOT BE FIXED
> until you reboot (ok, I've killed it, done the dbus status flush thing,
> and respawned it from the command line successfully twice, and each time
> was like half an hour of research _how_). But it was written by the same
> guys who did pulseaudio and systemd so you can't expect reliability out
> of it.) At least these days chrome doesn't listen to the "network is
> down" dbus notification and then refuse to show you pages from the web
> server running on loopback.
>
> (Did I mention I break everything? Seriously. People kept trying to pull
> me into a tester role for the first decade of my career... and the
> combination of sleep deprivation and caffeine makes me REALLY CHATTY and
> the airport shuttle arrived at 4:20am and last week I bought a 50 pack
> of "driving chocolate" squares that are 150mg of caffeine each and
> packed ALL OF THEM. Minus the ones I already ate.)

see... i knew there was a reason you read the bug report backwards :-)

> Anyway, what I want here is something with threads to test against,
> and both chrome and thunderbird have those, so....
>
>> and here's a Chrome engineer's "it's complicated, and everyone
>> misunderstands" post:
>> https://plus.google.com/+PeterKasting/posts/TC4ACtKevJY
>
> I follow "Security Princess" on twitter (probably @laprissa? See "no net
> right now" above). She blogs about this stuff from time to time. But I
> copied that link into a tab for when I get to Akihabara.
>
> Rob
>
> P.S. 9 open reply windows to deal with before I can close thunderbird! Woo!

--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
_______________________________________________
Toybox mailing list
[email protected]
http://lists.landley.net/listinfo.cgi/toybox-landley.net
