On 07/24/2017 02:57 AM, Marek Marczykowski-Górecki wrote:
> On Mon, Jul 24, 2017 at 12:30:25AM -0700, Andrew Morgan wrote:
>> On 07/23/2017 11:59 PM, Marek Marczykowski-Górecki wrote:
>>> On Sun, Jul 23, 2017 at 01:29:17PM -0700, Andrew Morgan wrote:
>>>> On 07/23/2017 12:58 PM, Jean-Philippe Ouellet wrote:
>>>>> On Sun, Jul 23, 2017 at 2:28 PM, Andrew Morgan 
>>>>> <[email protected]> wrote:
>>>>>> On 07/20/2017 08:27 AM, Jean-Philippe Ouellet wrote:
>>>>>>> On Thu, Jul 20, 2017 at 11:22 AM, Jean-Philippe Ouellet 
>>>>>>> <[email protected]> wrote:
>>>>>>>> On Thu, Jul 20, 2017 at 1:42 AM, Andrew Morgan 
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> Also did a test with moving in an enormous folder, daemon took up 16%
>>>>>>>>> CPU for a second in htop then right back to 0%, so seems pretty well
>>>>>>>>> optimized for now. inotify finds all the files and folders in way 
>>>>>>>>> until
>>>>>>>>> a few hundred milli-seconds, so we may need to scale our period for
>>>>>>>>> calling qvm-file-trust with a list of files down a bit (unless python
>>>>>>>>> can take in 10K+ full filepaths as arguments).
>>>>>>>>
>>>>>>>> During exec(2), the kernel places arguments somewhere at the top of
>>>>>>>> the stack, along with your environment variables and some other stuff.
>>>>>>>> Thus, the limit is likely actually some number of total bytes (also
>>>>>>>> dependent on other things like the total size of your current
>>>>>>>> environment), rather than the limit being only a fixed number of
>>>>>>>> arguments. This means you would have to check not just the number of
>>>>>>>> arguments, but the sum of the lengths of each.
>>>>>>>>
>>>>>>>> If you find yourself running into problems with to much data in argv
>>>>>>>> for a single exec, you may wish to consider letting xargs handle
>>>>>>>> splitting the paths into an appropriate number of separate execs of
>>>>>>>> your python script. This is one of the reasons it exists. If you do
>>>>>>>> this, be sure to split the paths with '\0' and use xargs -0.
>>>>>>>>
>>>>>>>> Consider this example:
>>>>>>>> $ cat argc.c
>>>>>>>> #include <stdio.h>
>>>>>>>> int main(int argc) { printf("%d\n", argc); }
>>>>>>>>
>>>>>>>> $ make argc
>>>>>>>> cc     argc.c   -o argc
>>>>>>>>
>>>>>>>> $ yes AAAA | head -$((1024*100)) | xargs ./argc
>>>>>>>> 26214
>>>>>>>> 26214
>>>>>>>> 26214
>>>>>>>> 23762
>>>>>>>>
>>>>>>>> $ yes AAAAAAAAAAAA | head -$((1024*100)) | xargs ./argc
>>>>>>>> 10082
>>>>>>>> 10082
>>>>>>>> 10082
>>>>>>>> 10082
>>>>>>>> 10082
>>>>>>>> 10082
>>>>>>>> 10082
>>>>>>>> 10082
>>>>>>>> 10082
>>>>>>>> 10082
>>>>>>>> 1591
>>>>>>>>
>>>>>>>> You may also wish to set an artificially small max length
>>>>>>>
>>>>>>> Either with xargs -s, or in your own script if you don't use xargs.
>>>>>>> The same concern exists either way.
>>>>>>>
>>>>>>> ISTM that being extra cautious at the expense of a few extra execs is
>>>>>>> a good trade-off. If performance really mattered you wouldn't be
>>>>>>> execing in the first place.
>>>>>>>
>>>>>>>> to guard
>>>>>>>> against any potential edge cases which xargs itself may have or may
>>>>>>>> develop in the future which may cause final arguments to get dropped
>>>>>>>> or truncated, as such bugs may be unlikely to be found and may have
>>>>>>>> very bad consequences (files not being marked as untrusted).
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Jean-Philippe
>>>>>>>
>>>>>>
>>>>>> So the exec* family of C functions separates char pointers by spaces,
>>>>>
>>>>> Err... not sure what you mean by that. Perhaps you are confusing
>>>>> exec's behavior with echo's?
>>>>>
>>>>>> and it doesn't seem to be configurable, thus I may have to keep the
>>>>>> space separation but escape spaces in the argument list.
>>>>>>
>>>>>> user@dev$ echo "hello there" this is a test for many words and xargs in
>>>>>> one go | xargs -s 24 ./argc
>>>>>> 5
>>>>>> 5
>>>>>> 4
>>>>>> 3
>>>>>> 2
>>>>>> user@dev$ echo "hello\ there" this is a test for many words and xargs in
>>>>>> one go | xargs -s 24 ./argc
>>>>>> 4
>>>>>> 5
>>>>>> 4
>>>>>> 3
>>>>>> 2
>>>>>>
>>>>>> I'll note it _only_ works if there is a preceding backslash and the
>>>>>> words are surrounded by double-quotes.
>>>>>
>>>>> That's why I suggested using xargs -0 and splitting filenames with '\0'.
>>>
>>>> Er, sorry. I'm referring to the following lines of C/C++ code I have in
>>>> qubes-trust-daemon:
>>>
>>>>      // Fork and attempt to call qvm-file-trust
>>>>      switch (child_pid=fork()) {
>>>>          case 0:
>>>>              // We're the child, call qvm-file-trust
>>>>              execv("/usr/bin/qvm-file-trust", (char **) qvm_argv);
>>>
>>>>              // Unreachable if no error
>>>>              perror("execl qvm-file-trust");
>>>>              exit(1);
>>>> ...
>>>
>>>> Filenames are contained within qvm_argv and are passed to qvm-file-trust
>>>> as arguments separated by space. 
>>>
>>> And here you are wrong. Arguments are passed exactly as you given them
>>> in qvm_argv, regardless of spaces inside.
>>> Also, remember that argv[0] should be a program name...
>>>
>>>
> 
>> Yes, a few lines before argv[0] is set to the program name and any
>> arguments are included before file paths are pumped into the array :)
> 
>> As for the arguments situation, if I understand correctly execv launches
>> a program and hands it its *argv array as if taken from stdin. If that
>> is so, would it then be correct to simply instead use execl in the
>> following manner?
> 
>>     execl("/usr/bin/xargs", "-0", "file name 1\0filename 2\0filename3\0")
> 
>> Essentially just one long string of all arguments, the final string
>> being built through a loop? 
> 
> The above will pass it as an argument, instead of stdin. xargs is for
> transforming data from stdin into argv, it would require setting pipe
> for stdin, then writing to it.
> 
> But ...
> 
>> That should work if I'm understanding it
>> correctly (correct me if I'm not), though not sure if an enormous string
>> of many, many filenames is desirable.
> 
>> Alternatively you could simply use execv and insert a null character at
>> the end of each string (presumably there's one already there though).
> 
>> While toying around with above solutions, I in the mean time built a
>> simple splitter in C++ that would handle running a separate instance of
>> qvm-file-trust after a configurable amount of arguments had been passed.
>> You can find that implementation here:
> 
>> https://github.com/anoadragon453/qubes-mime-types/blob/master/qubes-trust-daemon.cpp#L115
> 
> ... this looks simpler.
> Two additional things there:
> 
> 1. Check total length of arguments (sum of strlen(file_path). AFAIR
> sensible limit should be about 32k. You can simply break from the inner
> loop when the limit is reached. Also the outer loop condition could "it
> != file_paths.end()", instead of iteration counter.

Yes, that makes sense.

> 
> 2. You should check (and at least log non-zero) exit code of
> qvm-file-trust. Failing to mark untrusted file as untrusted may result
> in opening it locally and exploiting some bug in local editor, so this
> case should have some serious error handling (maybe even removing such
> file?).

Hm, we could quarantine it perhaps? Maybe in some folder TrustQuarantine
a file ~/Downloads/folder/file could be moved to
TrustQuarantine/home/user/Downloads/folder/file. This way the user knows
where the file was originally, without us having to touch the file in
any way.

> 
>> I'm not entirely sure if that is enough or if a solution such as xargs
>> that adjusts to system conditions is necessary, but for now I can say
>> that the splitter is enough to actually now have qubes-trust-daemon
>> function properly. It now watches folders and subfolders moved into
>> watched folders, as well as marking any file it comes across as
>> untrusted along the way.
> 
>> I also changed the daemon to mark all files in untrusted directories as
>> untrusted upon first run, to ensure that if the user added a new
>> untrusted directory while the daemon was offline, then the files within
>> would still be marked appropriately.
> 
> +1
> 
> 


-- 
You received this message because you are subscribed to the Google Groups 
"qubes-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/qubes-devel/ol4ghu%24ocr%241%40blaine.gmane.org.
For more options, visit https://groups.google.com/d/optout.

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to