On 07/20/2017 08:27 AM, Jean-Philippe Ouellet wrote:
> On Thu, Jul 20, 2017 at 11:22 AM, Jean-Philippe Ouellet
> <[email protected]> wrote:
>> On Thu, Jul 20, 2017 at 1:42 AM, Andrew Morgan
>> <[email protected]> wrote:
>>> Also did a test with moving in an enormous folder, daemon took up 16%
>>> CPU for a second in htop then right back to 0%, so seems pretty well
>>> optimized for now. inotify finds all the files and folders in way under
>>> a few hundred milliseconds, so we may need to scale our period for
>>> calling qvm-file-trust with a list of files down a bit (unless python
>>> can take in 10K+ full filepaths as arguments).
>>
>> During exec(2), the kernel places arguments somewhere at the top of
>> the stack, along with your environment variables and some other stuff.
>> Thus, the limit is likely actually some number of total bytes (also
>> dependent on other things like the total size of your current
>> environment), rather than the limit being only a fixed number of
>> arguments. This means you would have to check not just the number of
>> arguments, but the sum of the lengths of each.
>>
>> If you find yourself running into problems with too much data in argv
>> for a single exec, you may wish to consider letting xargs handle
>> splitting the paths into an appropriate number of separate execs of
>> your python script. This is one of the reasons it exists. If you do
>> this, be sure to split the paths with '\0' and use xargs -0.
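
For the '\0' plus xargs -0 suggestion, the calling side might look roughly like the sketch below. It assumes the caller already holds the paths in a list. The names nul_join, run_via_xargs, command and max_chars are hypothetical and only for illustration; none of them come from qvm-file-trust. Since a NUL byte cannot appear in a file path, spaces and newlines in names need no escaping with this approach.

```python
import os
import subprocess


def nul_join(paths):
    """Encode paths as NUL-terminated bytes, the input format of `xargs -0`."""
    return b"".join(os.fsencode(p) + b"\0" for p in paths)


def run_via_xargs(command, paths, max_chars=None):
    """Run `command` (a list of strings) over `paths` via xargs.

    xargs decides how many paths go into each exec. `max_chars`, if
    given, is passed to `xargs -s` to cap the command line size.
    """
    argv = ["xargs", "-0"]
    if max_chars is not None:
        argv += ["-s", str(max_chars)]
    argv += list(command)
    # Paths are fed on stdin, so they never pass through a shell.
    return subprocess.run(argv, input=nul_join(paths)).returncode
```

Something like run_via_xargs(["./argc"], paths, max_chars=4096) would then leave the splitting to xargs; the 4096 here is an arbitrary value, not a recommendation.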
>>
>> Consider this example:
>> $ cat argc.c
>> #include <stdio.h>
>> int main(int argc) { printf("%d\n", argc); }
>>
>> $ make argc
>> cc argc.c -o argc
>>
>> $ yes AAAA | head -$((1024*100)) | xargs ./argc
>> 26214
>> 26214
>> 26214
>> 23762
>>
>> $ yes AAAAAAAAAAAA | head -$((1024*100)) | xargs ./argc
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 1591
>>
>> You may also wish to set an artificially small max length
>
> Either with xargs -s, or in your own script if you don't use xargs.
> The same concern exists either way.
>
> ISTM that being extra cautious at the expense of a few extra execs is
> a good trade-off. If performance really mattered you wouldn't be
> execing in the first place.
>
>> to guard
>> against any potential edge cases which xargs itself may have or may
>> develop in the future which may cause final arguments to get dropped
>> or truncated, as such bugs may be unlikely to be found and may have
>> very bad consequences (files not being marked as untrusted).
>>
>> Cheers,
>> Jean-Philippe
>
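
If the splitting is done in our own script rather than by xargs, the "artificially small max length" idea could be sketched as below. This is only an illustration under stated assumptions: batch_paths and max_bytes are hypothetical names, the 32 KiB default is an arbitrary conservative figure and not a measured limit, and the per-argument overhead assumes 8-byte pointers. It budgets by total bytes instead of by argument count, as described above.

```python
import os


def batch_paths(paths, max_bytes=32 * 1024):
    """Yield lists of paths whose estimated argv size stays under max_bytes.

    A single path larger than max_bytes is still yielded, alone in its
    own batch, so that no path is ever silently dropped.
    """
    batch, used = [], 0
    for path in paths:
        # Estimated cost: encoded bytes, the terminating NUL, and one
        # argv pointer slot (assumed to be 8 bytes).
        cost = len(os.fsencode(path)) + 1 + 8
        if batch and used + cost > max_bytes:
            yield batch
            batch, used = [], 0
        batch.append(path)
        used += cost
    if batch:
        yield batch
```

Each yielded batch would then go to one exec of the script. The budget should stay well below the real limit, because the environment shares the same space.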
So the exec* family of C functions separates char pointers by spaces,
and it doesn't seem to be configurable; thus I may have to keep the
space separation but escape spaces in the argument list.

user@dev$ echo "hello there" this is a test for many words and xargs in one go | xargs -s 24 ./argc
5
5
4
3
2
user@dev$ echo "hello\ there" this is a test for many words and xargs in one go | xargs -s 24 ./argc
4
5
4
3
2

I'll note it _only_ works if there is a preceding backslash and the
words are surrounded by double-quotes.

Again, I'm not entirely sure whether a workaround for the large number
of arguments I'm handing Python is needed, but one strong benefit of
using xargs (or a similar method) is the ability to split up the list
and parallelize the calls. Since the script simply marks each file as
untrusted when called from the daemon, it should be fine to parallelize.

Another blocker is that the script currently calls sys.exit on an
error. This is undesirable because, depending on which file errors out,
no attempt is made to mark any of the subsequent files. In this case,
is it best to catch the error, store the error number, and then return
it after processing all files, regardless of what it may be? As long as
that number is only overridden by another error, any script calling it
should detect the non-zero return code and act accordingly.

The only issue would be if the calling script attempted to act based on
the specific error number, which may be overridden during execution by
a later error. Probably not something we need to worry about at the
moment; continuing execution is more important. I just wanted to get
people's thoughts on it.

Andrew Morgan
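
To make the error-handling question concrete, here is a minimal sketch of the "keep going, report at the end" idea. It is not the current script: mark_all is a hypothetical name, mark_untrusted is a hypothetical stand-in for whatever the script does per file, and it assumes failures surface as OSError. Unlike the "later error overrides" variant described above, this version keeps the first error code, which is one possible way around the overriding concern.

```python
import sys


def mark_all(paths, mark_untrusted):
    """Attempt every path; return 0 if all succeed, else the first error code."""
    exit_code = 0
    for path in paths:
        try:
            mark_untrusted(path)
        except OSError as err:
            # Report the failure, then continue with the remaining files.
            print(f"error marking {path}: {err}", file=sys.stderr)
            if exit_code == 0:
                # Fall back to 1 when the exception carries no errno.
                exit_code = err.errno or 1
    return exit_code
```

The script's entry point could then end with sys.exit(mark_all(paths, mark_untrusted)), so a caller still sees a non-zero status if any file failed.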
