-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On Mon, Jul 24, 2017 at 12:30:25AM -0700, Andrew Morgan wrote: > On 07/23/2017 11:59 PM, Marek Marczykowski-Górecki wrote: > > On Sun, Jul 23, 2017 at 01:29:17PM -0700, Andrew Morgan wrote: > >> On 07/23/2017 12:58 PM, Jean-Philippe Ouellet wrote: > >>> On Sun, Jul 23, 2017 at 2:28 PM, Andrew Morgan > >>> <[email protected]> wrote: > >>>> On 07/20/2017 08:27 AM, Jean-Philippe Ouellet wrote: > >>>>> On Thu, Jul 20, 2017 at 11:22 AM, Jean-Philippe Ouellet > >>>>> <[email protected]> wrote: > >>>>>> On Thu, Jul 20, 2017 at 1:42 AM, Andrew Morgan > >>>>>> <[email protected]> wrote: > >>>>>>> Also did a test with moving in an enormous folder, daemon took up 16% > >>>>>>> CPU for a second in htop then right back to 0%, so seems pretty well > >>>>>>> optimized for now. inotify finds all the files and folders in way > >>>>>>> until > >>>>>>> a few hundred milli-seconds, so we may need to scale our period for > >>>>>>> calling qvm-file-trust with a list of files down a bit (unless python > >>>>>>> can take in 10K+ full filepaths as arguments). > >>>>>> > >>>>>> During exec(2), the kernel places arguments somewhere at the top of > >>>>>> the stack, along with your environment variables and some other stuff. > >>>>>> Thus, the limit is likely actually some number of total bytes (also > >>>>>> dependent on other things like the total size of your current > >>>>>> environment), rather than the limit being only a fixed number of > >>>>>> arguments. This means you would have to check not just the number of > >>>>>> arguments, but the sum of the lengths of each. > >>>>>> > >>>>>> If you find yourself running into problems with to much data in argv > >>>>>> for a single exec, you may wish to consider letting xargs handle > >>>>>> splitting the paths into an appropriate number of separate execs of > >>>>>> your python script. This is one of the reasons it exists. If you do > >>>>>> this, be sure to split the paths with '\0' and use xargs -0. > >>>>>> > >>>>>> Consider this example: > >>>>>> $ cat argc.c > >>>>>> #include <stdio.h> > >>>>>> int main(int argc) { printf("%d\n", argc); } > >>>>>> > >>>>>> $ make argc > >>>>>> cc argc.c -o argc > >>>>>> > >>>>>> $ yes AAAA | head -$((1024*100)) | xargs ./argc > >>>>>> 26214 > >>>>>> 26214 > >>>>>> 26214 > >>>>>> 23762 > >>>>>> > >>>>>> $ yes AAAAAAAAAAAA | head -$((1024*100)) | xargs ./argc > >>>>>> 10082 > >>>>>> 10082 > >>>>>> 10082 > >>>>>> 10082 > >>>>>> 10082 > >>>>>> 10082 > >>>>>> 10082 > >>>>>> 10082 > >>>>>> 10082 > >>>>>> 10082 > >>>>>> 1591 > >>>>>> > >>>>>> You may also wish to set an artificially small max length > >>>>> > >>>>> Either with xargs -s, or in your own script if you don't use xargs. > >>>>> The same concern exists either way. > >>>>> > >>>>> ISTM that being extra cautious at the expense of a few extra execs is > >>>>> a good trade-off. If performance really mattered you wouldn't be > >>>>> execing in the first place. > >>>>> > >>>>>> to guard > >>>>>> against any potential edge cases which xargs itself may have or may > >>>>>> develop in the future which may cause final arguments to get dropped > >>>>>> or truncated, as such bugs may be unlikely to be found and may have > >>>>>> very bad consequences (files not being marked as untrusted). > >>>>>> > >>>>>> Cheers, > >>>>>> Jean-Philippe > >>>>> > >>>> > >>>> So the exec* family of C functions separates char pointers by spaces, > >>> > >>> Err... not sure what you mean by that. Perhaps you are confusing > >>> exec's behavior with echo's? > >>> > >>>> and it doesn't seem to be configurable, thus I may have to keep the > >>>> space separation but escape spaces in the argument list. > >>>> > >>>> user@dev$ echo "hello there" this is a test for many words and xargs in > >>>> one go | xargs -s 24 ./argc > >>>> 5 > >>>> 5 > >>>> 4 > >>>> 3 > >>>> 2 > >>>> user@dev$ echo "hello\ there" this is a test for many words and xargs in > >>>> one go | xargs -s 24 ./argc > >>>> 4 > >>>> 5 > >>>> 4 > >>>> 3 > >>>> 2 > >>>> > >>>> I'll note it _only_ works if there is a preceding backslash and the > >>>> words are surrounded by double-quotes. > >>> > >>> That's why I suggested using xargs -0 and splitting filenames with '\0'. > > > >> Er, sorry. I'm referring to the following lines of C/C++ code I have in > >> qubes-trust-daemon: > > > >> // Fork and attempt to call qvm-file-trust > >> switch (child_pid=fork()) { > >> case 0: > >> // We're the child, call qvm-file-trust > >> execv("/usr/bin/qvm-file-trust", (char **) qvm_argv); > > > >> // Unreachable if no error > >> perror("execl qvm-file-trust"); > >> exit(1); > >> ... > > > >> Filenames are contained within qvm_argv and are passed to qvm-file-trust > >> as arguments separated by space. > > > > And here you are wrong. Arguments are passed exactly as you given them > > in qvm_argv, regardless of spaces inside. > > Also, remember that argv[0] should be a program name... > > > > > > Yes, a few lines before argv[0] is set to the program name and any > arguments are included before file paths are pumped into the array :) > > As for the arguments situation, if I understand correctly execv launches > a program and hands it its *argv array as if taken from stdin. If that > is so, would it then be correct to simply instead use execl in the > following manner? > > execl("/usr/bin/xargs", "-0", "file name 1\0filename 2\0filename3\0") > > Essentially just one long string of all arguments, the final string > being built through a loop?
The above will pass it as an argument, instead of stdin. xargs is for transforming data from stdin into argv, it would require setting pipe for stdin, then writing to it. But ... > That should work if I'm understanding it > correctly (correct me if I'm not), though not sure if an enormous string > of many, many filenames is desirable. > > Alternatively you could simply use execv and insert a null character at > the end of each string (presumably there's one already there though). > > While toying around with above solutions, I in the mean time built a > simple splitter in C++ that would handle running a separate instance of > qvm-file-trust after a configurable amount of arguments had been passed. > You can find that implementation here: > > https://github.com/anoadragon453/qubes-mime-types/blob/master/qubes-trust-daemon.cpp#L115 ... this looks simpler. Two additional things there: 1. Check total length of arguments (sum of strlen(file_path). AFAIR sensible limit should be about 32k. You can simply break from the inner loop when the limit is reached. Also the outer loop condition could "it != file_paths.end()", instead of iteration counter. 2. You should check (and at least log non-zero) exit code of qvm-file-trust. Failing to mark untrusted file as untrusted may result in opening it locally and exploiting some bug in local editor, so this case should have some serious error handling (maybe even removing such file?). > I'm not entirely sure if that is enough or if a solution such as xargs > that adjusts to system conditions is necessary, but for now I can say > that the splitter is enough to actually now have qubes-trust-daemon > function properly. It now watches folders and subfolders moved into > watched folders, as well as marking any file it comes across as > untrusted along the way. > > I also changed the daemon to mark all files in untrusted directories as > untrusted upon first run, to ensure that if the user added a new > untrusted directory while the daemon was offline, then the files within > would still be marked appropriately. +1 - -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAEBCAAGBQJZdcR4AAoJENuP0xzK19cs1mMH+QGLPrnXzalARNJ8iZFwwtWt ArsyZGnnymjQ7jpj3TLfFt9RmZ1V0ZD5tFNDYeumWni59R5+rZgsIf0sYNjSvqsF mNe3LFJBxzLr3vMXliKwTQqY/V3ObEpBDmFoRK1c7bkNz+kybjaXFzUKxlv30NNN UU781COymYNQosGFJCvw7onYFt6IHW3rbD8Uc70sujIR2UMuvTI6b7qWGjB2vKsq pgM49CXfJBG4s7sNmdyM+hN4x+J6y1NiEtItmPerO5FZrZhooSYGHy1y5c0aBSVE LdjI1ZIq6XhwAM+MdkYhissDZ+nUAQmz9JpHMHJOL85fHCOZnu3yO79Fpoycoos= =dXbV -----END PGP SIGNATURE----- -- You received this message because you are subscribed to the Google Groups "qubes-devel" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/qubes-devel/20170724095711.GT1095%40mail-itl. For more options, visit https://groups.google.com/d/optout.
