Re: error: Invalid message (magic 0x00000000).

2016-10-28 Thread Eduardo Bustamante
Vegard: What version of screen are you using? (screen -v). I suspect
you're using screen compiled to use named pipes instead of unix domain
sockets.

My guess at the moment is this:

- Since you're running these commands inside an already existing
screen session (i.e. STY is defined), each screen invocation does not
start a new session; it asks the already running window manager to
open a new window.
- The way this works is that the new screen processes communicate with
the window manager via a unix domain socket or a named pipe, depending
on how screen was compiled. Each new screen process opens the write end
of the socket/pipe (the window manager process has the read end open)
and sends a "struct msg" message down it. On my system, this message is
12584 bytes long, i.e. sizeof(struct msg). The message starts with a
4-byte (int) magic value: the chars 'm', 's', 'g', followed by the
protocol revision number (currently 5). This value is defined as
MSG_REVISION in screen.h.
- If you're launching one window at a time, this poses no problem. The
issue arises when multiple screen processes write to the same pipe
concurrently. POSIX mandates that writes to a pipe of at most PIPE_BUF
bytes (4096 here, see "getconf PIPE_BUF /") be atomic. There are no
guarantees of atomicity for larger writes. See [1] and [2]. I suspect
that the writes are getting interleaved, and that's what causes the
magic check to fail: the window manager reads part of another message
where it expects the magic, and from what I've seen the messages are
mainly 0's, which explains the 0x00000000.

My suggestion here is to compile screen from git source (branch
screen-v4), make sure that the output of the configure script shows
that it's using unix domain sockets and try again. There should be no
interleaving if you're using unix domain sockets.
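
Besides reading the configure output, the type of a running session's
endpoint can be checked directly. A minimal sketch, assuming you know
the path of the session's entry in screen's socket directory (the
directory varies between builds; "screen -ls" prints it). The function
name is hypothetical.

```python
import os
import stat


def classify_endpoint(path):
    """Report whether a filesystem entry is a named pipe or a unix domain socket."""
    mode = os.stat(path).st_mode
    if stat.S_ISFIFO(mode):
        return 'named pipe'
    if stat.S_ISSOCK(mode):
        return 'unix domain socket'
    return 'other'
```

Call it with the full path of the session entry; a result of
'named pipe' would match the build type suspected above.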

Please let me know if that works for you.

[1] https://www.gnu.org/software/libc/manual/html_node/Pipe-Atomicity.html
[2] http://manpages.courier-mta.org/htmlman7/pipe.7.html#pipe-7_sect4

___
screen-users mailing list
screen-users@gnu.org
https://lists.gnu.org/mailman/listinfo/screen-users


Re: error: Invalid message (magic 0x00000000).

2016-10-19 Thread Aleksey Tsalolikhin
As a workaround, try putting a sleep 1 between launching screens?
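
A minimal sketch of that workaround in Python, assuming the windows are
launched from a script running inside the session. The function name
and its parameters are hypothetical; the run and sleep arguments exist
only so the launcher can be exercised without screen.

```python
import subprocess
import time


def launch_windows(commands, delay=1.0, run=subprocess.check_call, sleep=time.sleep):
    """Start one screen window per command, pausing between launches."""
    for i, cmd in enumerate(commands):
        if i:
            # Give the previous message time to be read before sending the next.
            sleep(delay)
        run(['screen'] + list(cmd))
```

For example, launch_windows([['bash', '-c', 'echo 1 > pids/$$']] * 100)
would take at least 99 seconds with the default delay. Whether a shorter
delay is enough is untested.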

On Tue, Oct 18, 2016 at 4:02 AM, Vegard Nossum wrote:

> Hi,
>
> When starting a lot of new screen windows in parallel, some windows
> don't actually open, and I get some errors instead:
>
> Invalid message (magic 0x00000000).
> Invalid message (magic 0x736e6f69).
>
> (the latter seems to be ASCII "snoi").
>
> You can reproduce it with e.g. this Python script:
>
> 8<---
> import os
> import shutil
> import sys
> import subprocess
> import time
>
> if not os.getenv('STY'):
>    sys.stderr.write('Need to run inside an existing screen session\n')
>    sys.exit(1)
>
> # create pids/
> if os.path.exists('pids'):
>    shutil.rmtree('pids')
> os.mkdir('pids')
>
> expected_n = 100
> for i in range(expected_n):
>    subprocess.check_call(['screen', 'bash', '-c', 'echo 1 > pids/$$'])
>
> # give the processes above some time to start
> time.sleep(5)
>
> actual_n = len(os.listdir('pids'))
>
> print("expected %u, got %u" % (expected_n, actual_n))
> 8<---
>
> This will mostly print "expected 100, got 100", but sometimes it
> prints "expected 100, got 99".
>
> I've only found this earlier report (from 2005) of something similar:
>
> https://lists.gnu.org/archive/html/screen-users/2005-01/msg00057.html
>
> They are also starting a lot of windows in parallel, i.e.: "I'm
> writing a system that puts a lot of windows (about 12) into a screen
> session".
>
> Are there any recent fixes or workarounds for this problem?
>
> Thanks,
>
>
> Vegard



-- 
Need CFEngine training?  Email train...@verticalsysadmin.com


error: Invalid message (magic 0x00000000).

2016-10-18 Thread Vegard Nossum
Hi,

When starting a lot of new screen windows in parallel, some windows
don't actually open, and I get some errors instead:

Invalid message (magic 0x00000000).
Invalid message (magic 0x736e6f69).

(the latter seems to be ASCII "snoi").
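
That reading can be checked with a few lines of Python. This is just a
sketch; the helper name is hypothetical, and it assumes the hex value is
printed most significant byte first.

```python
import struct


def magic_to_ascii(magic):
    """Render a 32-bit magic value as its four bytes, most significant first."""
    raw = struct.pack('>I', magic)
    # Show printable ASCII as-is and everything else as a dot.
    return ''.join(chr(b) if 32 <= b < 127 else '.' for b in bytearray(raw))


if __name__ == '__main__':
    print(magic_to_ascii(0x736e6f69))  # snoi
    print(magic_to_ascii(0x00000000))  # ....
```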

You can reproduce it with e.g. this Python script:

8<---
import os
import shutil
import sys
import subprocess
import time

if not os.getenv('STY'):
   sys.stderr.write('Need to run inside an existing screen session\n')
   sys.exit(1)

# create pids/
if os.path.exists('pids'):
   shutil.rmtree('pids')
os.mkdir('pids')

expected_n = 100
for i in range(expected_n):
   subprocess.check_call(['screen', 'bash', '-c', 'echo 1 > pids/$$'])

# give the processes above some time to start
time.sleep(5)

actual_n = len(os.listdir('pids'))

print("expected %u, got %u" % (expected_n, actual_n))
8<---

This will mostly print "expected 100, got 100", but sometimes it
prints "expected 100, got 99".

I've only found this earlier report (from 2005) of something similar:

https://lists.gnu.org/archive/html/screen-users/2005-01/msg00057.html

They are also starting a lot of windows in parallel, i.e.: "I'm
writing a system that puts a lot of windows (about 12) into a screen
session".

Are there any recent fixes or workarounds for this problem?

Thanks,


Vegard
