On 07/20/2017 08:27 AM, Jean-Philippe Ouellet wrote:
> On Thu, Jul 20, 2017 at 11:22 AM, Jean-Philippe Ouellet 
> <[email protected]> wrote:
>> On Thu, Jul 20, 2017 at 1:42 AM, Andrew Morgan 
>> <[email protected]> wrote:
>>> Also did a test with moving in an enormous folder, daemon took up 16%
>>> CPU for a second in htop then right back to 0%, so seems pretty well
>>> optimized for now. inotify finds all the files and folders in way under
>>> a few hundred milliseconds, so we may need to scale our period for
>>> calling qvm-file-trust with a list of files down a bit (unless python
>>> can take in 10K+ full filepaths as arguments).
>>
>> During exec(2), the kernel places arguments somewhere at the top of
>> the stack, along with your environment variables and some other stuff.
>> Thus, the limit is likely actually some number of total bytes (also
>> dependent on other things like the total size of your current
>> environment), rather than the limit being only a fixed number of
>> arguments. This means you would have to check not just the number of
>> arguments, but the sum of the lengths of each.
>>
>> If you find yourself running into problems with too much data in argv
>> for a single exec, you may wish to consider letting xargs handle
>> splitting the paths into an appropriate number of separate execs of
>> your python script. This is one of the reasons it exists. If you do
>> this, be sure to split the paths with '\0' and use xargs -0.
>>
>> Consider this example:
>> $ cat argc.c
>> #include <stdio.h>
>> int main(int argc, char **argv) { (void)argv; printf("%d\n", argc); return 0; }
>>
>> $ make argc
>> cc     argc.c   -o argc
>>
>> $ yes AAAA | head -$((1024*100)) | xargs ./argc
>> 26214
>> 26214
>> 26214
>> 23762
>>
>> $ yes AAAAAAAAAAAA | head -$((1024*100)) | xargs ./argc
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 10082
>> 1591
>>
>> You may also wish to set an artificially small max length
> 
> Either with xargs -s, or in your own script if you don't use xargs.
> The same concern exists either way.
> 
> ISTM that being extra cautious at the expense of a few extra execs is
> a good trade-off. If performance really mattered you wouldn't be
> execing in the first place.
> 
>> to guard
>> against any potential edge cases which xargs itself may have or may
>> develop in the future which may cause final arguments to get dropped
>> or truncated, as such bugs may be unlikely to be found and may have
>> very bad consequences (files not being marked as untrusted).
>>
>> Cheers,
>> Jean-Philippe
> 

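So, as far as I can tell, it is xargs (not the exec* family of C
functions, which just takes an array of NUL-terminated strings) that
separates arguments on whitespace by default. Unless I switch to '\0'
separators with xargs -0, I may have to keep the space separation but
escape spaces in the argument list.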

user@dev$ echo "hello there" this is a test for many words and xargs in \
    one go | xargs -s 24 ./argc
5
5
4
3
2
user@dev$ echo "hello\ there" this is a test for many words and xargs in \
    one go | xargs -s 24 ./argc
4
5
4
3
2

I'll note it _only_ works if the backslash actually reaches xargs, e.g.
when the words are surrounded by double-quotes so that the shell does not
consume the backslash first.
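
As a cross-check, here is a minimal Python sketch (the helper names
argv_seen_by_child and nul_join are made up for illustration, they are
not part of qvm-file-trust). It passes arguments to a child process as a
list, with no shell and no xargs in between, and reports what the child
received. An argument containing a space should arrive intact, which
suggests the splitting seen above comes from xargs parsing its input.
The second helper shows one way to produce '\0'-separated input for
xargs -0.

```python
import os
import subprocess
import sys


def argv_seen_by_child(args):
    """Run a tiny child Python process and return the argv it received."""
    # The child prints its arguments joined by NUL so spaces stay unambiguous.
    child_code = "import sys; sys.stdout.write('\\0'.join(sys.argv[1:]))"
    result = subprocess.run(
        [sys.executable, "-c", child_code] + list(args),
        stdout=subprocess.PIPE,
        check=True,
    )
    out = result.stdout.decode()
    return out.split("\0") if out else []


def nul_join(paths):
    """Encode paths as NUL-terminated bytes, suitable as input to `xargs -0`."""
    return b"".join(os.fsencode(p) + b"\0" for p in paths)
```

If the daemon ends up calling the script directly with a list like this,
no escaping of spaces should be needed at all; escaping only matters when
the paths pass through something that re-parses them as text.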

Again, I'm not entirely sure whether a workaround for the large number of
arguments I'm handing python is needed, but one strong benefit of using
xargs (or a similar method) is the ability to split up the list and
parallelize the calls. Since the script simply marks each file as
untrusted when called from the daemon, it should be fine to parallelize.
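
If we do the splitting ourselves instead of via xargs, a rough sketch
could look like the following. Everything here is an assumption for
illustration: the function names, the 32 KiB budget, and the worker count
are hypothetical, and the budget is deliberately far below any real
kernel limit. It groups paths by total encoded bytes (not by count) and
runs one exec per batch in parallel.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def chunk_by_bytes(paths, max_bytes=32 * 1024):
    """Yield batches whose encoded size (one NUL per path included) fits max_bytes.

    A single path larger than max_bytes still gets a batch of its own.
    """
    batch, size = [], 0
    for path in paths:
        cost = len(os.fsencode(path)) + 1
        if batch and size + cost > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(path)
        size += cost
    if batch:
        yield batch


def run_batches(command, paths, max_bytes=32 * 1024, workers=4):
    """Run `command + batch` once per batch, in parallel; return the exit codes."""
    def run(batch):
        return subprocess.call(list(command) + batch)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, chunk_by_bytes(paths, max_bytes)))
```

The daemon would then call something like
run_batches(["qvm-file-trust", ...], paths), with whatever options the
script actually takes filled in, and treat any non-zero entry in the
returned list as a failure.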

Another blocker is that the script currently calls sys.exit on an error.
This behavior is undesirable because, depending on which file errors out,
none of the subsequent files will be marked. In this case, is it best to
catch the error, store the error number, and then return it at the end
of processing all files, regardless of what it may be?

As long as that number is only overwritten by another error, any
script calling it should detect the non-zero return code and act
accordingly. The only issue would be if the calling script attempted to
act based on the specific error number, which may be overwritten during
execution by a later error.
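
To make the idea concrete, here is a minimal sketch of the "keep going,
report at the end" loop. The names mark_all and mark_one are
hypothetical, and I'm assuming failures surface as OSError. Note that
this variant keeps the first error number rather than letting a later
one replace it, which is just one possible choice and sidesteps the
overwriting concern.

```python
import sys


def mark_all(paths, mark_one):
    """Try to mark every path; return 0 on success, else the first errno seen."""
    exit_code = 0
    for path in paths:
        try:
            mark_one(path)
        except OSError as err:
            # Report the failure but keep processing the remaining files.
            print("failed to mark %s: %s" % (path, err), file=sys.stderr)
            if exit_code == 0:
                exit_code = err.errno or 1
    return exit_code
```

The script's entry point would then end with
sys.exit(mark_all(paths, mark_one)), so a caller still sees a non-zero
return code if anything failed.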

Probably not something we need to worry about at the moment; continuing
execution is more important. I just wanted to get people's thoughts on it.

Andrew Morgan
