Re: [PacketFence-users] RESOLVED: Upgrading PF 6.5 to 7.0 haproxy not starting

2017-06-02 Thread Louis Munro via PacketFence-users
Hi Ian,


> On Jun 2, 2017, at 13:58, Ian MacDonald via PacketFence-users wrote:
> 
> This was very helpful and immediately brought us to conclude it was related 
> to a change in our certs, that we opportunistically pushed out,  as a root 
> cause of our issue.  Is there a place in the docs that describes how to get 
> these debug outputs, to better help us help ourselves in the future?

Unfortunately, each daemon has its own ideas as to what constitutes "debug 
mode" and how to trigger it.
Some are much better at it (e.g. FreeRADIUS) than others.

I had to learn by trial and error myself, at least for those services where 
Inverse did not write the code (e.g. Apache).

In general, you can ask systemd which executable is actually run for a 
service.
That information is on the "ExecStart" line of the systemctl output (for 
example, from "systemctl status" or "systemctl cat").
From there it's mostly a trip to the manpage for it, trying to run it with 
--help, or reading the source if it comes to that.
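
As a rough sketch of that first step (the helper name exec_start is made up 
for illustration, and it assumes each ExecStart sits on a single line of the 
unit file), the command line can be pulled out of the unit text like this:

```shell
#!/bin/sh
# exec_start: print the ExecStart command line(s) found in systemd unit
# text read from stdin. Hypothetical helper, not part of PacketFence.
# ExecStartPre=/ExecStartPost= lines are deliberately not matched.
exec_start() {
    sed -n 's/^ExecStart=//p'
}

# Example on a live system (unit name taken from this thread):
#   systemctl cat packetfence-haproxy | exec_start
```

"systemctl show -p ExecStart <unit>" gives similar information, in a more 
verbose format.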

Some examples: 

FreeRADIUS has excellent debugging features, which are well documented.
man radiusd (or man freeradius on Debian) shows, for example, the -C and -X 
flags, which respectively check the configuration syntax and run the server 
in debug mode.
Additionally you can use "raddebug" to debug a live server without restarting 
and even filter requests so that only the ones matching a condition will 
trigger the debug mode.
Kudos to the FreeRADIUS team.

Apache has a -X mode, as indicated in man httpd (on CentOS, which I have in 
front of me at the moment).

ISC DHCPd has -f and -d switches to force a process to stay in the foreground 
and log to STDERR.

man winbindd shows switches for --foreground, --stdout, and --debuglevel.
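
Purely as a memory aid, the switches above can be collected in one place. 
This is a sketch, not anything shipped with PacketFence: the function name 
debug_invocation and the debug level of 3 are arbitrary choices of mine, and 
binary names differ between distributions. You would still need to append 
whatever config arguments the service's ExecStart line uses.

```shell
#!/bin/sh
# debug_invocation DAEMON: print the foreground/debug flags mentioned in
# this message for DAEMON. Hypothetical helper for illustration only.
debug_invocation() {
    case "$1" in
        radiusd|freeradius) printf '%s -X\n' "$1" ;;      # full debug mode
        httpd)              printf '%s -X\n' "$1" ;;      # per man httpd on CentOS
        dhcpd)              printf '%s -f -d\n' "$1" ;;   # foreground, log to stderr
        winbindd)           printf '%s --foreground --stdout --debuglevel=3\n' "$1" ;;  # level 3 is an assumption
        *)
            printf 'no flags recorded for %s; try its manpage or --help\n' "$1" >&2
            return 1
            ;;
    esac
}
```

For example, "debug_invocation dhcpd" prints "dhcpd -f -d".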

In addition, the log level of most PacketFence services can be configured 
through the conf/log.conf and conf/log.conf.d/* files.
Changing the loglevel from INFO to DEBUG can be helpful, but would not have 
helped in your case since the service was not even starting.


> 
> The actual issue was that even though the cert, key and intermediate were 
> concatenated together into the .pem file, in the right order, one of the 
> files had different line endings (CRLF vs LF, i.e. Windows vs Linux), 
> something introduced by our CA that was not obvious and did not affect 
> applying the same files to the configuration GUI (nor any other system 
> using the same wildcard certs). 
> 
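
Mixed line endings like these are easy to check for before restarting the 
service. The following is only a sketch: the helper names has_crlf and 
strip_crlf are invented here, and it assumes a PEM file should contain no 
carriage returns at all.

```shell
#!/bin/sh
# has_crlf FILE: succeed if any line in FILE ends with a carriage return.
has_crlf() {
    cr=$(printf '\r')
    grep -q "${cr}\$" "$1"
}

# strip_crlf FILE: print FILE to stdout with every carriage return removed.
strip_crlf() {
    tr -d '\r' < "$1"
}

# Example (path taken from the haproxy error output in this thread):
#   has_crlf /usr/local/pf/conf/ssl/server.pem && echo "CRLF line endings found"
```

Writing the cleaned output to a new file and comparing it with the original 
before swapping it in keeps the change easy to undo.
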
> On a note related to upgrade in general,  our team saw the release for 7.1, 
> which we are excited about with the inclusion of Ubiquiti devices, and I had 
> some comments back on the upgrade process that might help clarify things for 
> other users upgrading and using the UPGRADE.asciidoc as a reference.  We 
> think it would be worthwhile to tell people to explicitly execute the Version 
> specific steps prior to the Distribution specific steps.  Some justification 
> follows. 
> 
> We knew from our v6.5 to 7.0 upgrade that the section for "Upgrading from a 
> version prior to 7.0.0" had to be executed before the section for "Debian 
> based systems" because it would not make sense not to upgrade MariaDB 
> first.   For anyone who started on v7.0.1 or later and who might 
> appropriately skip the "Upgrading from a version prior to 7.0.0" section, it 
> really is not clear which group of steps you should execute first -> i.e. 
> Should the user perform the Distribution specific steps before the Version 
> specific steps or vice versa. It does hint in the doc that 'some steps may 
> be required to be done BEFORE the packages upgrades'  but it never really 
> says clearly 'Go do all the Version specific steps further down the document 
> before you come back up and do your distribution-specific steps'.   Anyone 
> that reads it all, and just executes in order, would (we think) be doing it 
> in the incorrect order.
> 



Fair points, all of them.

We'll try to do better and be more explicit in the future.

Best regards,
--
Louis Munro
lmu...@inverse.ca   ::  www.inverse.ca 
 
+1.514.447.4918 x125  :: +1 (866) 353-6153 x125
Inverse inc. :: Leaders behind SOGo (www.sogo.nu ) and 
PacketFence (www.packetfence.org )
--
PacketFence-users mailing list
PacketFence-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/packetfence-users


[PacketFence-users] RESOLVED: Upgrading PF 6.5 to 7.0 haproxy not starting

2017-06-02 Thread Ian MacDonald via PacketFence-users
Louis,

Thanks for your input on the issue. Some responses to your request for
info are below.


>> The main problem seems to be that the haproxy service is not starting.

>> In the syslog we just get a generic service failure with no details
>>
>> May 29 16:51:08 pf2 systemd[1]: Started PacketFence HAProxy Load Balancer.
>> May 29 16:51:08 pf2 systemd[1]: packetfence-haproxy.service: main process exited, code=exited, status=1/FAILURE
>> May 29 16:51:08 pf2 systemd[1]: Unit packetfence-haproxy.service entered failed state.
>> May 29 16:51:08 pf2 systemd[1]: packetfence-haproxy.service holdoff time over, scheduling restart.
>> May 29 16:51:08 pf2 systemd[1]: Stopping PacketFence HAProxy Load Balancer...
>>

>Let's try a few things.
>
>First, can you please post the output to these commands:
>
># systemctl status packetfence-haproxy

pf2:~# systemctl status packetfence-haproxy -l
* packetfence-haproxy.service - PacketFence HAProxy Load Balancer
   Loaded: loaded (/lib/systemd/system/packetfence-haproxy.service; enabled)
   Active: failed (Result: start-limit) since Mon 2017-05-29 16:51:15 EDT; 3 days ago
  Process: 1031 ExecStart=/usr/sbin/haproxy-systemd-wrapper -f /usr/local/pf/var/conf/haproxy.conf -p /usr/local/pf/var/run/haproxy.pid (code=exited, status=1/FAILURE)
  Process: 977 ExecStartPre=/usr/local/pf/bin/pfcmd service haproxy generateconfig (code=exited, status=0/SUCCESS)
 Main PID: 1031 (code=exited, status=1/FAILURE)

May 29 16:51:15 pf2 systemd[1]: packetfence-haproxy.service: main process exited, code=exited, status=1/FAILURE
May 29 16:51:15 pf2 systemd[1]: Unit packetfence-haproxy.service entered failed state.
May 29 16:51:15 pf2 haproxy-systemd-wrapper[1031]: haproxy-systemd-wrapper: exit, haproxy RC=1
May 29 16:51:15 pf2 systemd[1]: packetfence-haproxy.service holdoff time over, scheduling restart.
May 29 16:51:15 pf2 systemd[1]: Stopping PacketFence HAProxy Load Balancer...
May 29 16:51:15 pf2 systemd[1]: Starting PacketFence HAProxy Load Balancer...
May 29 16:51:15 pf2 systemd[1]: packetfence-haproxy.service start request repeated too quickly, refusing to start.
May 29 16:51:15 pf2 systemd[1]: Failed to start PacketFence HAProxy Load Balancer.
May 29 16:51:15 pf2 systemd[1]: Unit packetfence-haproxy.service entered failed state.

># systemctl cat packetfence-haproxy

pf2:~# systemctl cat packetfence-haproxy
# /lib/systemd/system/packetfence-haproxy.service
[Unit]
Description=PacketFence HAProxy Load Balancer
Before=packetfence-httpd.portal.service packetfence-httpd.admin.service
Wants=packetfence-config.service

[Service]
StartLimitBurst=3
StartLimitInterval=60
PIDFile=/usr/local/pf/var/run/haproxy.pid
ExecStartPre=/usr/local/pf/bin/pfcmd service haproxy generateconfig
ExecStart=/usr/sbin/haproxy-systemd-wrapper -f /usr/local/pf/var/conf/haproxy.conf -p /usr/local/pf/var/run/haproxy.pid
ExecReload=/bin/kill -USR2 $MAINPID
Restart=on-failure

[Install]
WantedBy=packetfence-base.target


> # ps -ef | grep haproxy

pf2:~# ps -ef | grep haproxy
root 11820 11782  0 10:16 pts/0    00:00:00 grep haproxy


>As to the configuration itself, look in /usr/local/pf/var/conf/haproxy.conf
>to see the configuration that is actually generated by the
>conf/haproxy.conf template.

We did peek in here and nothing jumped out at us.

>You can try running haproxy in debug mode to see what error messages may be
>lurking there:
>
># /usr/sbin/haproxy -f /usr/local/pf/var/conf/haproxy.conf -p /usr/local/pf/var/run/haproxy.pid -d

This was very helpful and immediately brought us to conclude it was related
to a change in our certs, that we opportunistically pushed out,  as a root
cause of our issue.  Is there a place in the docs that describes how to get
these debug outputs, to better help us help ourselves in the future?

pf2:~# /usr/sbin/haproxy -f /usr/local/pf/var/conf/haproxy.conf -p /usr/local/pf/var/run/haproxy.pid -d
[ALERT] 152/125205 (13132) : parsing [/usr/local/pf/var/conf/haproxy.conf:110] : 'bind 10.4.2.2:443' : unable to load SSL private key from PEM file '/usr/local/pf/conf/ssl/server.pem'.
[ALERT] 152/125205 (13132) : parsing [/usr/local/pf/var/conf/haproxy.conf:156] : 'bind 10.4.3.2:443' : '/usr/local/pf/conf/ssl/server.pem'.
[ALERT] 152/125205 (13132) : parsing [/usr/local/pf/var/conf/haproxy.conf:202] : 'bind 10.4.1.2:443' : unable to load SSL private key from PEM file '/usr/local/pf/conf/ssl/server.pem'.
[ALERT] 152/125205 (13132) : Error(s) found in configuration file : /usr/local/pf/var/conf/haproxy.conf
[WARNING] 152/125205 (13132) : Proxy 'stats': in multi-process mode, stats will be limited to process assigned to the current request.
[ALERT] 152/125205 (13132) : Proxy 'portal-https-10.4.2.2': no SSL certificate specified for bind '10.4.2.2:443' at [/usr/local/pf/var/conf/haproxy.conf:110] (use 'crt').
[ALERT] 152/125205 (13132) : Proxy 'portal-https-10.4.3.2': no SSL certificate specified for bind '10.4.3.2:443' at [/usr/local/pf/var/conf/haproxy.conf:156] (use 'crt').