Nicola Cocchiaro <nicola.cocchi...@gmail.com> writes:
> On Tuesday, March 11, 2014 8:06:42 PM UTC-7, Nikolaus Rath wrote:
>>
>> Hi Nicola, 
>>
>> Nicola Cocchiaro writes: 
>> > The reason I originally asked was due to seeing some outbound
>> > connections not completing but just hanging, until using umount.s3ql
>> > would let them return with a TimeoutError (no less than 15 minutes
>> > later in all cases seen). I was not able to dig much deeper at the
>> > time, but to experiment more I put together a simple patch to add a
>> > configurable socket timeout to all S3ql tools that may make use of it.
>> > I've attached it if you'd like to consider it.
>>
>> Thanks for the patch! I'm generally rather reluctant to add new 
>> command-line options unless they are absolutely crucial. The problem is 
>> that the number of possible configurations (and potential interaction
>> bugs) goes up exponentially with every option.
>>
>> In this case, I am not sure I fully understand in which situation this 
>> option is intended to be used (so I'm pretty sure that a regular user 
>> wouldn't know when to use it either, which is always a bad sign for a 
>> command line argument). Could you give some additional details on that? 
>>
>> For example, if I'm not happy with the system timeout (which seems to be 
>> 15 minutes in your case), shouldn't this be adjusted on the OS level as
>> well? And if not, is there really a need to make the timeout 
>> configurable rather than having S3QL simply use an internal default? 
>>
>
>
> The problem is that there was no apparent system timeout, or those 
> connections did not seem to be timing out on their own in the cases 
> observed; I did not have the means at the time to figure out why exactly 
> this would happen, but the root cause of all this was a temporary 
> malfunction on the Google Storage side. The 15 minutes come from a 
> different process which had its own timeout (15 minutes in fact) for 
> allowing S3ql to unmount; in response to the timeout firing it called 
> umount.s3ql again, and that in turn seemed to allow the connections to be 
> recognized as timed out (possibly in response to sending SIGTERM to the 
> mount.s3ql process? This was my theory after looking at the timing from a 
> number of logs but again, unfortunately I do not have 100% solid evidence 
> that this was the reason).

Hmm. I don't think this is likely. umount.s3ql does not send SIGTERM. It
sets a termination flag that is checked for in the main file system
loop. So just calling umount.s3ql would not cause any pending socket
operations to terminate.
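
To illustrate what I mean, here is a minimal sketch. It is not S3QL's
actual code: worker(), run() and the socketpair setup are hypothetical
stand-ins. It only assumes that a flag is checked between socket
operations, so a recv() that blocks without a timeout never gets to see
the flag.

```python
import socket
import threading
import time


def worker(sock, stop_flag, log):
    # Hypothetical stand-in for a main loop: the flag is only checked
    # between socket operations, never while one of them is blocked.
    while not stop_flag.is_set():
        try:
            data = sock.recv(1024)  # blocks until data, EOF, or timeout
        except socket.timeout:
            log.append("timeout")
            continue
        if not data:  # peer closed the connection
            break
        log.append("data")
    log.append("exit")


def run(timeout):
    # b never sends anything, which simulates a hung remote end.
    a, b = socket.socketpair()
    a.settimeout(timeout)  # None means "block indefinitely"
    stop_flag = threading.Event()
    log = []
    t = threading.Thread(target=worker, args=(a, stop_flag, log), daemon=True)
    t.start()
    time.sleep(0.2)
    stop_flag.set()  # analogous to setting a termination flag
    t.join(1.0)
    exited_after_flag = not t.is_alive()
    b.close()  # EOF on a, so a worker still stuck in recv() can return
    t.join(1.0)
    a.close()
    return exited_after_flag, log
```

With run(None) I would expect the worker to still be stuck in recv()
after the flag is set, while run(0.05) should let it notice the flag at
the next timeout.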

Is there any way to reproduce the problem you had? 

> A static, internal default may be enough and in fact it helped when I
> first tried it, but more advanced users may want to adapt the timeout
> to their own use case when relying on the OS doesn't seem to help like
> in the cases observed. I understand the reluctance to add more options
> and increase complexity for all users, but I thought I'd share this
> patch for consideration, perhaps as an extra option in a set of
> clearly marked "advanced" options.

Understood, thanks. I don't want to rule out such an option yet, but if
we add it, at the very least there should be some documentation
explaining what exactly the option does, and when it should be used. At
the moment, this seems rather unclear to me (see above).
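
For the sake of discussion, an internal default with an optional
override could look roughly like the sketch below. This is not taken
from your patch: DEFAULT_SOCKET_TIMEOUT, its value and
apply_socket_timeout() are made-up names for illustration. The only
real API used is socket.setdefaulttimeout() from the Python standard
library, which affects sockets created after the call.

```python
import socket

# Hypothetical internal default in seconds; the value is an assumption.
DEFAULT_SOCKET_TIMEOUT = 300.0


def apply_socket_timeout(timeout=None):
    """Set the process-wide default timeout for newly created sockets.

    If timeout is None, the internal default is used. Returns the value
    that was applied.
    """
    value = DEFAULT_SOCKET_TIMEOUT if timeout is None else float(timeout)
    if value <= 0:
        raise ValueError("socket timeout must be a positive number of seconds")
    socket.setdefaulttimeout(value)
    return value
```

Calling apply_socket_timeout() once at startup would give every tool the
same default, and an "advanced" option would only need to pass its value
through.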


Best,
-Nikolaus

-- 
Encrypted emails preferred.
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

             »Time flies like an arrow, fruit flies like a Banana.«
