Re: error: Invalid message (magic 0x00000000).
Vegard:

What version of screen are you using (screen -v)? I suspect you're using a screen compiled to use named pipes instead of unix domain sockets.

My guess at the moment is this:

- Since you're running these commands inside an already existing screen session (i.e. STY is defined), screen does not start a new session; instead, it asks the already running window manager to launch the new windows.

- The way this works is that the new screen processes communicate with the window manager via a unix domain socket or a named pipe, depending on how screen was compiled. The new screen process opens the write end of the socket/pipe (the window manager process has the read end open) and sends a "struct msg" message down the socket/pipe. On my system, this message is 12584 bytes long, i.e. sizeof(struct msg). The message starts with a magic 4-byte (int) sequence: the chars 'm', 's', 'g', followed by the protocol revision number (currently 5). This sequence is defined as MSG_REVISION in screen.h.

- If you're launching one window at a time, this poses no problem. The issue arises when you have multiple screen processes writing to the same pipe concurrently. POSIX mandates that writes to a pipe of at most PIPE_BUF bytes (4096 bytes here, per getconf PIPE_BUF /) be atomic. There are no guarantees of atomicity for larger writes. See [1].

I suspect that the writes are getting interleaved, and that this is what causes the magic sequence check to fail: the window manager is reading part of another message, and from what I've seen the messages are mainly 0's, which explains the 0x00000000.

My suggestion is to compile screen from git source (branch screen-v4), make sure the output of the configure script shows that it's using unix domain sockets, and try again. There should be no interleaving if you're using unix domain sockets.

Please let me know if that works for you.
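To make the interleaving hypothesis above concrete, here is a minimal Python 3 simulation. It is only a sketch: it does not touch real pipes or screen itself, it assumes a worst-case schedule in which two writers alternate PIPE_BUF-sized chunks, and it assumes the magic is packed as 'm', 's', 'g' plus revision 5 in one 32-bit int. The helper names (make_msg, interleave, read_magics) are made up for illustration.

```python
import struct

MSG_SIZE = 12584   # sizeof(struct msg) as reported in the message above
PIPE_BUF = 4096    # atomic write limit as reported in the message above
# Assumed packing of the magic: 'm', 's', 'g' and the revision number (5).
MSG_MAGIC = (ord("m") << 24) | (ord("s") << 16) | (ord("g") << 8) | 5


def make_msg():
    """Build a fake message: 4-byte magic followed by zero padding."""
    return struct.pack("=I", MSG_MAGIC).ljust(MSG_SIZE, b"\0")


def chunks(data, size):
    """Split data into pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def interleave(a, b):
    """Alternate the chunks of two writers (a hypothetical worst case)."""
    out = bytearray()
    for x, y in zip(chunks(a, PIPE_BUF), chunks(b, PIPE_BUF)):
        out += x + y
    return bytes(out)


def read_magics(stream):
    """Read fixed-size records, as the receiver would, and return each magic."""
    return [struct.unpack("=I", rec[:4])[0] for rec in chunks(stream, MSG_SIZE)]


magics = read_magics(interleave(make_msg(), make_msg()))
for m in magics:
    status = "ok" if m == MSG_MAGIC else "Invalid message"
    print("%s (magic 0x%08x)" % (status, m))
```

Under these assumptions the first record still starts with a valid magic, while the second record begins 296 bytes into a chunk of the other writer's zero-filled body, so its magic reads as 0x00000000.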
[1] https://www.gnu.org/software/libc/manual/html_node/Pipe-Atomicity.html
[2] http://manpages.courier-mta.org/htmlman7/pipe.7.html#pipe-7_sect4

___
screen-users mailing list
screen-users@gnu.org
https://lists.gnu.org/mailman/listinfo/screen-users
Re: error: Invalid message (magic 0x00000000).
As a workaround, try putting a sleep 1 between launching screens?

On Tue, Oct 18, 2016 at 4:02 AM, Vegard Nossum wrote:
> Hi,
>
> When starting a lot of new screen windows in parallel, some windows
> don't actually open, and I get some errors instead:
>
> Invalid message (magic 0x00000000).
> Invalid message (magic 0x736e6f69).
>
> [...]
>
> Are there any recent fixes or workarounds for this problem?
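A rough Python 3 sketch of that workaround, reusing the screen command from the reproduction script. The function name launch_all and its run/sleep parameters are hypothetical; they exist only so the loop can be exercised without a real screen session. Whether a one-second pause is enough to avoid the error is untested here.

```python
import subprocess
import time


def launch_all(count, delay=1.0, run=subprocess.check_call, sleep=time.sleep):
    """Open `count` screen windows, pausing `delay` seconds between launches."""
    cmd = ["screen", "bash", "-c", "echo 1 > pids/$$"]
    for i in range(count):
        run(cmd)
        # No need to wait after the last window has been requested.
        if i < count - 1:
            sleep(delay)


# Inside an existing screen session (STY set, pids/ created), one would call:
# launch_all(100)
```

The pause keeps each window request from overlapping with the next one, at the cost of a much slower startup for large window counts.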
error: Invalid message (magic 0x00000000).
Hi,

When starting a lot of new screen windows in parallel, some windows
don't actually open, and I get some errors instead:

Invalid message (magic 0x00000000).
Invalid message (magic 0x736e6f69).

(the latter seems to be ASCII "snoi").

You can reproduce it e.g. with this Python script:

8<---
import os
import shutil
import sys
import subprocess
import time

if not os.getenv('STY'):
    print >>sys.stderr, 'Need to run inside an existing screen session'
    sys.exit(1)

# create pids/
if os.path.exists('pids'):
    shutil.rmtree('pids')
os.mkdir('pids')

expected_n = 100
for i in range(expected_n):
    subprocess.check_call(['screen', 'bash', '-c', 'echo 1 > pids/$$'])

# give the processes above some time to start
time.sleep(5)

actual_n = len(os.listdir('pids'))

print "expected %u, got %u" % (expected_n, actual_n)
8<---

This will mostly print "expected 100, got 100", but sometimes it
prints "expected 100, got 99".

I've only found this earlier report (from 2005) of something similar:

https://lists.gnu.org/archive/html/screen-users/2005-01/msg00057.html

They are also starting a lot of windows in parallel, i.e.: "I'm
writing a system that puts a lot of windows (about 12) into a screen
session".

Are there any recent fixes or workarounds for this problem?

Thanks,

Vegard
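For reference, the second magic value can be decoded with a few lines of Python 3. This is only a sketch: the report does not state the machine's byte order, so both readings are shown. Assuming a little-endian host (e.g. x86), the four bytes actually read from the stream would have been "ions" rather than "snoi", which looks like the tail of ordinary text.

```python
import struct

magic = 0x736e6f69

# Bytes as written in the hex literal (most significant byte first).
as_printed = struct.pack(">I", magic)
# Bytes as they would sit in memory on a little-endian host (an assumption).
in_memory_le = struct.pack("<I", magic)

print(as_printed.decode("ascii"))
print(in_memory_le.decode("ascii"))
```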