--- Begin Message ---
Package: openswan
Version: 1:2.4.4-1
Severity: important
Tags: patch
I was trying to get OPENSWAN working on my trusty SPARC in my basement
and ran into a problem: the helper process that PLUTO starts so it can
handle crypto requests asynchronously would die. I saw this in
syslog:
########################################################################
Dec 1 11:54:25 henkle pluto[2072]: | 0: w->pcw_dead: 0 w->pcw_work: 0 cnt: 1
Dec 1 11:54:25 henkle pluto[7976]: | opening /dev/urandom
Dec 1 11:54:25 henkle pluto[2072]: "Aztlan"[6] 192.168.0.100 #9: started helper pid=7976 (fd:6)
Dec 1 11:54:25 henkle pluto[7976]: "Aztlan"[6] 192.168.0.100 #9: forgetting secrets
Dec 1 11:54:25 henkle pluto[2072]: | asking helper 0 to do build_kenonce op on seq: 8
Dec 1 11:54:25 henkle pluto[7976]: ! helper 0 waiting on fd: 7
Dec 1 11:54:25 henkle pluto[2072]: | inserting event EVENT_CRYPTO_FAILED, timeout in 300 seconds for #9
Dec 1 11:54:25 henkle pluto[2072]: | complete state transition with STF_SUSPEND
Dec 1 11:54:25 henkle pluto[2072]: | next event EVENT_PENDING_PHASE2 in 47 seconds
Dec 1 11:54:25 henkle pluto[2072]: | helper 0 has work (cnt now 0)
Dec 1 11:54:25 henkle pluto[2072]: read failed with -1: Connection reset by peer
Dec 1 11:54:25 henkle pluto[2072]: closing helper(0) pid=7976 fd=6 exit=0
I found that adding "nhelpers=0" to the ipsec.conf file "fixed" the problem,
so I dug a little deeper. I built a debug PLUTO from sources and ran
GDB on the helper:
########################################################################
# gdb //usr/lib/ipsec/pluto
GNU gdb 6.3.90_20051119-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "sparc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) attach 30319
Attaching to program: /usr/lib/ipsec/pluto, process 30319
Loaded symbols for /usr/lib/libgmp.so.3
Loaded symbols for /lib/libresolv.so.2
Loaded symbols for /lib/libc.so.6
Loaded symbols for /lib/ld-linux.so.2
0x70163780 in read () from /lib/libc.so.6
(gdb) where
#0 0x70163780 in read () from /lib/libc.so.6
#1 0x00064158 in pluto_crypto_helper (fd=7, helpernum=0) at pluto_crypt.c:163
#2 0x00065aa4 in init_crypto_helper (w=0x112428, n=0) at pluto_crypt.c:664
#3 0x00065d34 in init_crypto_helpers (nhelpers=1) at pluto_crypt.c:745
#4 0x0002ad94 in main (argc=11, argv=0xeff358d4) at plutomain.c:758
(gdb) cont
Continuing.
##### [ Then from another machine I tried setting up an L2TP/IPsec session ]
Program received signal SIGBUS, Bus error.
0x00063fc0 in pluto_crypto_helper (fd=7, helpernum=0) at pluto_crypt.c:168
168 restlen = r->pcr_len-sizeof(r->pcr_len);
(gdb) where
#0 0x00063fc0 in pluto_crypto_helper (fd=7, helpernum=0) at pluto_crypt.c:168
#1 0x00065aa4 in init_crypto_helper (w=0x112428, n=0) at pluto_crypt.c:664
#2 0x00065d34 in init_crypto_helpers (nhelpers=1) at pluto_crypt.c:745
#3 0x0002ad94 in main (argc=11, argv=0xeff358d4) at plutomain.c:758
(gdb) list
163 while(read(fd, reqbuf, sizeof(r->pcr_len)) == sizeof(r->pcr_len)) {
164 int restlen;
165 int actlen;
166
167 r = (struct pluto_crypto_req *)reqbuf;
168 restlen = r->pcr_len-sizeof(r->pcr_len);
169
170 passert(restlen < (signed)PCR_REQ_SIZE);
171
172 /* okay, got a basic size, read the rest of it */
(gdb) print reqbuf
$1 = "\000\000\nl", '\0' <repeats 2673 times>
(gdb) print & reqbuf
$2 = (char (*)[2678]) 0xeff34bae
(gdb) quit
The structure member pcr_len is a size_t (long) and is the first member
of the structure. The address 0xeff34bae is NOT kosher on a SPARC for a
long: it is only 2-byte aligned, and a long needs to sit on a long word
boundary to avoid an alignment fault. This problem happens all the time
when people port stuff from x86 to SPARC machines. GDB will read it fine,
but the program will fault trying to use the pointer.
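To make the diagnosis concrete, here is a minimal sketch in C that checks
the address GDB reported and decodes the four bytes GDB printed for reqbuf.
The function names (demo_is_aligned, demo_be32, demo_read_len) are
hypothetical and not part of PLUTO. The sketch assumes a 4-byte, big-endian
size_t, as on 32-bit SPARC userland; under that assumption the bytes
"\000\000\nl" decode to a pcr_len of 2668.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if addr is a multiple of align; align must be a power of two. */
static int demo_is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Decodes four bytes as a big-endian 32-bit value (SPARC byte order). */
static unsigned long demo_be32(const unsigned char b[4])
{
    return ((unsigned long)b[0] << 24) | ((unsigned long)b[1] << 16) |
           ((unsigned long)b[2] << 8)  |  (unsigned long)b[3];
}

/* Reads a size_t from a buffer of any alignment. memcpy copies byte by
 * byte as far as the language is concerned, so this cannot raise SIGBUS
 * the way dereferencing a misaligned size_t pointer can. */
static size_t demo_read_len(const char *buf)
{
    size_t len;
    memcpy(&len, buf, sizeof len);
    return len;
}
```

With these helpers, demo_is_aligned(0xeff34bae, 4) is false while
demo_is_aligned(0xeff34bae, 2) is true, which matches the fault on line 168.
Copying the length out with memcpy, as in demo_read_len, is another way to
avoid the fault, but it only protects that one field; later accesses through
the struct pointer would still need an aligned buffer.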
One possible solution, that should be safe for all architectures, is to
force the alignment of the 'reqbuf' to be the same as the alignment of
the first structure member. Since this structure is only used for
communication within the same machine, no other alignment issues will
arise. The following patch does the trick for me:
*** pluto_crypt.c.orig Tue Jul 12 22:14:08 2005
--- pluto_crypt.c Thu Dec 1 14:51:06 2005
***************
*** 146,152 ****
void pluto_crypto_helper(int fd, int helpernum)
{
! char reqbuf[PCR_REQ_SIZE];
struct pluto_crypto_req *r;
signal(SIGHUP, catchhup);
--- 146,152 ----
void pluto_crypto_helper(int fd, int helpernum)
{
! long reqbuf[PCR_REQ_SIZE/sizeof(long)];
struct pluto_crypto_req *r;
signal(SIGHUP, catchhup);
***************
*** 160,166 ****
, helpernum, fd));
memset(reqbuf, 0, sizeof(reqbuf));
! while(read(fd, reqbuf, sizeof(r->pcr_len)) == sizeof(r->pcr_len)) {
int restlen;
int actlen;
--- 160,166 ----
, helpernum, fd));
memset(reqbuf, 0, sizeof(reqbuf));
! while(read(fd, (char*)reqbuf, sizeof(r->pcr_len)) == sizeof(r->pcr_len)) {
int restlen;
int actlen;
***************
*** 170,176 ****
passert(restlen < (signed)PCR_REQ_SIZE);
/* okay, got a basic size, read the rest of it */
! if((actlen= read(fd, reqbuf+sizeof(r->pcr_len), restlen)) != restlen) {
/* faulty read. die, parent will restart us */
loglog(RC_LOG_SERIOUS, "cryptographic helper(%d) read(%d)=%d failed: %s\n",
--- 170,176 ----
passert(restlen < (signed)PCR_REQ_SIZE);
/* okay, got a basic size, read the rest of it */
! if((actlen= read(fd, ((char *)reqbuf)+sizeof(r->pcr_len), restlen)) != restlen) {
/* faulty read. die, parent will restart us */
loglog(RC_LOG_SERIOUS, "cryptographic helper(%d) read(%d)=%d failed: %s\n",
***************
*** 470,476 ****
*/
void handle_helper_comm(struct pluto_crypto_worker *w)
{
! char reqbuf[PCR_REQ_SIZE];
struct pluto_crypto_req *r;
int restlen;
int actlen;
--- 470,476 ----
*/
void handle_helper_comm(struct pluto_crypto_worker *w)
{
! long reqbuf[PCR_REQ_SIZE/sizeof(long)];
struct pluto_crypto_req *r;
int restlen;
int actlen;
***************
*** 484,490 ****
,w->pcw_work));
/* read from the pipe */
! actlen = read(w->pcw_pipe, reqbuf, sizeof(r->pcr_len));
if(actlen != sizeof(r->pcr_len)) {
if(actlen != 0) {
--- 484,490 ----
,w->pcw_work));
/* read from the pipe */
! actlen = read(w->pcw_pipe, (char *)reqbuf, sizeof(r->pcr_len));
if(actlen != sizeof(r->pcr_len)) {
if(actlen != 0) {
***************
*** 517,523 ****
/* okay, got a basic size, read the rest of it */
if((actlen= read(w->pcw_pipe
! , reqbuf+sizeof(r->pcr_len)
, restlen)) != restlen) {
/* faulty read. die, parent will restart us */
--- 517,523 ----
/* okay, got a basic size, read the rest of it */
if((actlen= read(w->pcw_pipe
! , ((char*)reqbuf)+sizeof(r->pcr_len)
, restlen)) != restlen) {
/* faulty read. die, parent will restart us */
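One caveat with sizing a long array by division: PCR_REQ_SIZE/sizeof(long)
rounds down. With the 2678-byte buffer GDB reported and a 4-byte long, that
is 669 elements, or 2676 bytes, two bytes short of the original buffer. A
possible variation, sketched below, is a union that keeps the full byte
size while taking its alignment from the struct. All names here are
hypothetical stand-ins: DEMO_REQ_SIZE takes the place of PCR_REQ_SIZE, and
struct demo_req models only the leading length field of struct
pluto_crypto_req. The sketch uses _Alignof, which assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PCR_REQ_SIZE; 2678 is the array size GDB printed. */
#define DEMO_REQ_SIZE 2678

/* Hypothetical stand-in for struct pluto_crypto_req (length field only). */
struct demo_req {
    size_t len;
};

/* The union is aligned for its strictest member, so bytes[] starts on a
 * boundary suitable for struct demo_req, and it is at least
 * DEMO_REQ_SIZE bytes long. */
union demo_reqbuf {
    struct demo_req req;
    char bytes[DEMO_REQ_SIZE];
};

/* Number of longs needed to hold nbytes, rounding up instead of down. */
#define DEMO_LONGS_FOR(nbytes) (((nbytes) + sizeof(long) - 1) / sizeof(long))

/* Returns 1 if the byte view of the buffer is aligned for the struct. */
static int demo_buffer_is_aligned(const union demo_reqbuf *b)
{
    return ((uintptr_t)b->bytes % _Alignof(struct demo_req)) == 0;
}

/* Sanity checks on a stack-allocated buffer, as in the helper function. */
static void demo_check_buffer(void)
{
    union demo_reqbuf b;
    assert(demo_buffer_is_aligned(&b));
    assert(sizeof b.bytes == DEMO_REQ_SIZE);
}
```

If the long array is preferred, DEMO_LONGS_FOR shows the rounded-up element
count that keeps the buffer at least as large as the original.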
Note that this bug exists in two places, though I observed it
only in pluto_crypto_helper. I went ahead and fixed both
functions identically.
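Both functions share the same pattern: read the length field first, then
read the rest of the message. A minimal sketch of that pattern, assuming a
simplified message layout, is shown here. struct demo_req and demo_read_req
are hypothetical and are not the PLUTO code; the buffer is declared as the
struct type itself, so its alignment is correct by construction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical message: total length first, then a payload. */
struct demo_req {
    size_t len;          /* total message size, including this field */
    char payload[248];
};

/* Reads one length-prefixed message from fd into *req.
 * Returns 0 on success, -1 on a short read or an implausible length. */
static int demo_read_req(int fd, struct demo_req *req)
{
    char *bytes = (char *)req;   /* byte view of an aligned struct */
    size_t restlen;
    ssize_t actlen;

    assert(req != NULL);

    /* Step 1: read just the length field. */
    actlen = read(fd, bytes, sizeof req->len);
    if (actlen != (ssize_t)sizeof req->len)
        return -1;

    /* Reject lengths that would underflow or overrun the buffer. */
    if (req->len < sizeof req->len || req->len > sizeof *req)
        return -1;
    restlen = req->len - sizeof req->len;

    /* Step 2: read the rest of the message after the length field. */
    actlen = read(fd, bytes + sizeof req->len, restlen);
    return (actlen == (ssize_t)restlen) ? 0 : -1;
}
```

Because req points at a real struct demo_req, the access to req->len after
the first read is aligned on any architecture, which is the property the
patch restores for reqbuf.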
-- System Information:
Debian Release: 3.1
APT prefers testing
APT policy: (990, 'testing'), (80, 'stable'), (50, 'unstable')
Architecture: sparc (sparc64)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.14-g741b2252
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Versions of packages openswan depends on:
ii bind9-host [host] 1:9.3.1-2 Version of 'host' bundled with BIN
ii bsdmainutils 6.1.2 collection of more utilities from
ii debconf [debconf-2.0] 1.4.58 Debian configuration management sy
ii debianutils 2.14.3 Miscellaneous utilities specific t
ii iproute 20041019-3 Professional tools to control the
ii ipsec-tools 1:0.6.2-2 IPsec tools for Linux
ii libc6 2.3.5-6 GNU C Library: Shared libraries an
ii libcurl3 7.15.0-4 Multi-protocol file transfer libra
ii libgmp3c2 4.1.4-10 Multiprecision arithmetic library
ii libldap2 2.1.30-12 OpenLDAP libraries
ii libpam0g 0.76-23 Pluggable Authentication Modules l
ii libssl0.9.8 0.9.8a-3 SSL shared libraries
ii makedev 2.3.1-78 creates device files in /dev
ii openssl 0.9.7g-2 Secure Socket Layer (SSL) binary a
openswan recommends no packages.
-- debconf information excluded
--- End Message ---