Re: [HACKERS] beta3 Solaris 7 (SPARC) port report [ Was: Lookingfor . . . ]

2001-01-26 Thread Pete Forman

Peter Eisentraut writes:
  Frank Joerdens writes:
  
I have experienced before that Unix sockets will cause random
connection abortions on Solaris [ . . . ]
  
   Isn't that _really_ bad? Random connection abortions when going
   over Unix sockets?? My app does _all_ the connecting over Unix
   sockets?!
  
  That's bad, for sure.  Maybe you can check for odd conditions
  surrounding the /tmp directory, like is it on NFS, permission
  problems, mount options.  Or is there something odd in the kernel
  configuration?  If I'm counting correctly this is the third
  independent report of this problem, which is scary.

I'm not sure if you counted me.  I also observed that Unix sockets
cause the parallel tests to fail in random places on Solaris.


We had a similar problem porting a product that uses a lot of IPC to
Solaris.  There were failures involving the overloading of the Unix
domain sockets.  We took the code to Sun and they were unable to
resolve the problems.  It should have been possible to tune the kernel
to provide more resources.  However it turns out that some of the
parameters that we wanted to tune were ignored in favour of hard coded
values.  In the end we rewrote our code to use Internet domain sockets
(AF_INET).



BTW, owing to a DNS error email to me has bounced over the last couple
of days.  It should be okay now if anything needs to be resent.
-- 
Pete Forman -./\.- Disclaimer: This post is originated
WesternGeco   -./\.-  by myself and does not represent
[EMAIL PROTECTED] -./\.-  opinion of Schlumberger, Baker
http://www.crosswinds.net/~petef  -./\.-  Hughes or their divisions.



Re: [HACKERS] beta3 Solaris 7 (SPARC) port report [ Was: Lookingfor . . . ]

2001-01-25 Thread bpalmer

Worked fine for me...

% uname -a

SunOS lancelot 5.7 Generic_106541-14 sun4m sparc SUNW,SPARCstation-4

% ls -l

-rw-r--r--   1 bpalmer  staff    32860160 Jan 23 16:45 postgresql-snapshot.tar

...
...
...
 transactions ... ok
 random   ... failed (ignored)
 portals  ... ok
...
...
...

==
 75 of 76 tests passed, 1 failed test(s) ignored.
==



On Thu, 25 Jan 2001, Peter Eisentraut wrote:

 Frank Joerdens writes:

 [randomly varying set of regression tests fail]

  Running the tests on my Linux box gives no failed tests. Must I assume
  that those failed tests indicate some issue that is detrimental to
  the proper functioning of the server on this Solaris installation? Do
  you want the regression.diffs?

 Could you go into src/test/regress/pg_regress.sh and edit around line 162

 #case $host_platform in
 #*-*-qnx* | *beos*)
 unix_sockets=no
 #*)
 #unix_sockets=yes;;
 #esac

 (i.e., ensure that unix_sockets is set to 'no'), and rerun 'make check'.

 I have experienced before that Unix sockets will cause random connection
 abortions on Solaris, which will cause the regression tests to fail
 arbitrarily.

  I also tried using the Sun compiler, which didn't work at all.

 details on "didn't work" requested...

  now I get scary stuff like:
 
  --- begin scary stuff ---
  test int2 ... ERROR:  pg_atoi: error in "34.5": can't parse ".5"
  ERROR:  pg_atoi: error reading "10": Result too large
  ERROR:  pg_atoi: error in "asdf": can't parse "asdf"

 This is normal.  The regression tests sometimes involve intentional
 invalid input.

 --
 Peter Eisentraut  [EMAIL PROTECTED]   http://yi.org/peter-e/





b. palmer,  [EMAIL PROTECTED]
pgp:  www.crimelabs.net/bpalmer.pgp5





Re: [HACKERS] beta3 Solaris 7 (SPARC) port report [ Was: Lookingfor . . . ]

2001-01-25 Thread Peter Eisentraut

Frank Joerdens writes:

  That's bad, for sure.  Maybe you can check for odd conditions surrounding
  the /tmp directory, like is it on NFS, permission problems, mount options.

 I have neither root nor physical access to this machine, hence my
 options are kinda limited.

Entering 'mount' should tell you.
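
Without root, a few read-only checks are still possible. The following is a minimal sketch, assuming the server keeps its Unix socket file under /tmp; the function name check_socket_dir is made up for illustration. Running 'df -k /tmp' and 'ls -ld /tmp' alongside it shows which filesystem backs the directory and what its permissions are.

```sh
# Hypothetical helper: report whether a directory looks usable as a
# location for a Unix-domain socket file (exists and is writable).
check_socket_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable"
        return 1
    fi
    echo "ok"
}

# Assumption: /tmp is the socket directory in question.
check_socket_dir /tmp || echo "inspect /tmp before trusting Unix sockets"
```

This only rules out the simplest permission problems; NFS or unusual mount options still have to be read off the 'mount' output.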

 I'll question the sysadmin about that. But why does make installcheck
 work? Because it goes over TCP/IP sockets by default?

No.  Presumably because it does not run more than one test in parallel.

 "pg_backup_null.c", line 90: controlling expressions must have scalar type
 cc: acomp failed for pg_backup_null.c

Line 90 has a comment in my copy.

-- 
Peter Eisentraut  [EMAIL PROTECTED]   http://yi.org/peter-e/




Re: [HACKERS] beta3 Solaris 7 (SPARC) port report [ Was: Lookingfor . . . ]

2001-01-24 Thread Peter Eisentraut

Frank Joerdens writes:

[randomly varying set of regression tests fail]

 Running the tests on my Linux box gives no failed tests. Must I assume
 that those failed tests indicate some issue that is detrimental to
 the proper functioning of the server on this Solaris installation? Do
 you want the regression.diffs?

Could you go into src/test/regress/pg_regress.sh and edit around line 162

#case $host_platform in
#*-*-qnx* | *beos*)
unix_sockets=no
#*)
#unix_sockets=yes;;
#esac

(i.e., ensure that unix_sockets is set to 'no'), and rerun 'make check'.

I have experienced before that Unix sockets will cause random connection
abortions on Solaris, which will cause the regression tests to fail
arbitrarily.
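
A less manual variant of that edit, as a sketch only: rather than commenting out the case statement, a Solaris branch could be added to it. The function name choose_unix_sockets, the *-*-solaris* pattern, and the sample platform string are assumptions for illustration and are not taken from pg_regress.sh.

```sh
# Hypothetical helper: decide whether the regression driver should use
# Unix-domain sockets for a given host platform string.
choose_unix_sockets() {
    case $1 in
        *-*-qnx* | *beos*)
            echo no ;;
        *-*-solaris*)
            # Assumption: added here because of the reported random
            # connection aborts over Unix sockets on Solaris.
            echo no ;;
        *)
            echo yes ;;
    esac
}

# Sample platform string, used only when host_platform is not already set.
host_platform=${host_platform:-sparc-sun-solaris2.7}
unix_sockets=$(choose_unix_sockets "$host_platform")
echo "unix_sockets=$unix_sockets"
```

With unix_sockets set to 'no' the tests connect over TCP/IP instead, which is the same effect as the hand edit.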

 I also tried using the Sun compiler, which didn't work at all.

details on "didn't work" requested...

 now I get scary stuff like:

 --- begin scary stuff ---
 test int2 ... ERROR:  pg_atoi: error in "34.5": can't parse ".5"
 ERROR:  pg_atoi: error reading "10": Result too large
 ERROR:  pg_atoi: error in "asdf": can't parse "asdf"

This is normal.  The regression tests sometimes involve intentional
invalid input.

-- 
Peter Eisentraut  [EMAIL PROTECTED]   http://yi.org/peter-e/