[PATCH] ISSUES: expand on mail archive info + subscription disclaimer

2017-03-21 Thread Eric Wong
Tis better to pull than push, or something like that.
---
 ISSUES | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/ISSUES b/ISSUES
index 291441a..2857c63 100644
--- a/ISSUES
+++ b/ISSUES
@@ -68,10 +68,11 @@ document distributed with git) on guidelines for patch 
submission.
 * nntp://news.gmane.org/gmane.comp.lang.ruby.unicorn.general
 * nntp://news.public-inbox.org/inbox.comp.lang.ruby.unicorn
 * https://bogomips.org/unicorn-public/
+* http://ou63pmih66umazou.onion/unicorn-public/
 
 Mailing list subscription is optional, so Cc: all participants.
 
-You can follow along via NNTP:
+You can follow along via NNTP (read-only):
 
nntp://news.public-inbox.org/inbox.comp.lang.ruby.unicorn
nntp://news.gmane.org/gmane.comp.lang.ruby.unicorn.general
@@ -79,6 +80,7 @@ You can follow along via NNTP:
 Or Atom feeds:
 
https://bogomips.org/unicorn-public/new.atom
+   http://ou63pmih66umazou.onion/unicorn-public/new.atom
 
The HTML archives at https://bogomips.org/unicorn-public/
also has links to per-thread Atom feeds and downloadable
@@ -88,3 +90,6 @@ You may optionally subscribe via plain-text email:
 
mailto:unicorn-public+subscr...@bogomips.org
(and confirming the auto-reply)
+
+Just keep in mind we suck at delivering email, so using NNTP,
+or Atom feeds might be a better bet...
-- 
BOFH
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: mail queue breakage :x

2017-03-21 Thread Eric Wong
Eric Wong  wrote:
> Anyways, feel free to edit/run it yourself, with or without tor
> and add mirrors and whatnot.  And you can hammer the hell out

Just be warned that the attached validation script checks against
deliveries to mail-archive.com; so no, don't hammer the
mail-archive.com servers; I don't know how they set their server
up...

But yeah, anything running off bogomips.org is designed to
handle being on the frontpage of major news aggregators and
then some.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: mail queue breakage :x

2017-03-21 Thread Eric Wong
Maybe the attached script will help detect mirror mismatches
in the future (run via cron).  But of course the original
mail failure was due to the crontab of the mlmmj user
getting clobbered :x

Anyways, feel free to edit/run it yourself, with or without tor
and add mirrors and whatnot.  And you can hammer the hell out
of https://bogomips.org/unicorn-public/new.atom or
http://ou63pmih66umazou.onion/unicorn-public/new.atom or
nntp://news.public-inbox.org/

(it doesn't run unicorn or nginx :P)


bogomips-mail-check.rb
Description: application/ruby


mail queue breakage :x

2017-03-21 Thread Eric Wong
Apologies for the flood of mail; apparently there was a lot of
stuff backed up for a month which was getting archived but not
delivered.

Anyways NNTP and HTTPS archives have been working, and I forgot
SMTP was still supposed to go out :x

nntp://news.public-inbox.org/inbox.comp.lang.ruby.unicorn
https://bogomips.org/unicorn-public/
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: check_client_connection using getsockopt(2)

2017-03-21 Thread Eric Wong
Simon Eskildsen  wrote:
> On Wed, Feb 22, 2017 at 8:42 PM, Eric Wong  wrote:
> > Simon Eskildsen  wrote:
> >> I meant to ask, in Raindrops why do you use the netlink API to get the
> >> socket backlog instead of `getsockopt(2)` with `TCP_INFO` to get
> >> `tcpi_unacked`? (as described in
> >> http://www.ryanfrantz.com/posts/apache-tcp-backlog/) We use this to
> >> monitor socket backlogs with a sidekick Ruby daemon. Although we're
> >> looking to replace it with a middleware to simplify for Kubernetes.
> >> It's one of our main metrics for monitoring performance, especially
> >> around deploys.
> >
> > The netlink API allows independently-spawned processes to
> > monitor others; so it can be system-wide.  TCP_INFO requires the
> > process doing the checking to also have the socket open.
> >
> > I guess this reasoning for using netlink is invalid for containers,
> > though...
> 
> If you namespace the network it's problematic, yeah. I'm considering
> right now putting Raindrops in a middleware with the netlink API
> inside the container, but it feels weird. That said, if you consider
> the alternative of using `getsockopt(2)` on the listening socket, I
> don't know how you'd get access to the Unicorn listening socket from a
> middleware. Would it be nuts to expose a hook in Unicorn that allows
> periodic execution for monitoring listening stats from Raindrops on
> the listening socket? It seems somewhat of a narrow use-case, but on
> the other hand I'm also not a fan of doing
> `Raindrops::Linux.tcp_listener_stats("localhost:#{ENV["PORT"]}")`, but
> that might be less ugly.

Yeah, all options get pretty ugly since Rack doesn't expose that
in a standardized way.  Unicorn::HttpServer::LISTENERS is a
historical constant which stores all listeners and is used by
some parts of raindrops.
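
For reference, the netlink-based call under discussion looks roughly
like this (a sketch assuming the raindrops gem on Linux; the address
below is just an example):

  require 'raindrops'

  addrs = %w(0.0.0.0:8080) # listeners we care about
  stats = Raindrops::Linux.tcp_listener_stats(addrs)
  stats.each do |addr, s|
    printf("%s active=%d queued=%d\n", addr, s.active, s.queued)
  end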

Maybe relying on ObjectSpace or walking the LISTEN_FDS env from
systemd is acceptable for other servers *shrug*


Another way might be to rely on tcpi_last_data_recv in struct
tcp_info from each and every client socket.  That should tell
you when a client stopped writing to the socket, which works for
most HTTP requests.  You could use the same getsockopt() call
you'd use for check_client_connection to read that field.

However, this won't be visible until the client is accept()ed.
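
A rough sketch of that idea (assuming Raindrops::TCP_Info maps
tcpi_last_data_recv to #last_data_recv the same way it maps
tcpi_state to #state):

  require 'raindrops'

  # client is an accept(2)-ed TCP socket
  def client_idle_ms(client)
    info = Raindrops::TCP_Info.new(client) # getsockopt(TCP_INFO) underneath
    info.last_data_recv # msec since the client last sent us data
  end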

raindrops actually has some support for this, but it was
ugly, too (hooking into *accept* methods):
https://bogomips.org/raindrops/Raindrops/LastDataRecv.html

Perhaps refinements could be used, nowadays (I've never used
them).  Just throwing options out there...


In any case, what I definitely do not want is to introduce more
specialized configuration or API which might lock people into
unicorn or burden people with versioning incompatibilities.

> > (*) So I've been wondering if adding a "unicorn-mode" to an
> > existing C10K, slow-client-capable Ruby/Rack server +
> > reverse proxy makes sense for containerized deploys.
> > Maybe...
> 
> I'd love to hear more about this idea. What are you contemplating?

Maybe another time, and not on the unicorn list.  I don't feel
comfortable writing about unrelated/tangential projects on the
unicorn list, even if I'm the leader of both projects.  Anyways,
this other server is also in the rack README.md and I've been
announcing it on ruby-talk since 2013.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Simon Eskildsen
I've put ccc-tcp-v3 in production today--it appears to be working as
expected, still rejecting at the exact same rate as before (100-200
per second for us during steady-state).

On Wed, Mar 8, 2017 at 7:06 AM, Simon Eskildsen
 wrote:
> Patch-set looks great Eric, thanks!
>
> I'm hoping to test this in production later this week or next, but
> we're a year behind on Unicorn (ugh), so will need to carefully roll
> that out.
>
> On Tue, Mar 7, 2017 at 7:26 PM, Eric Wong  wrote:
>> Eric Wong  wrote:
>>> Simon Eskildsen  wrote:
>>> > @@ -28,6 +29,7 @@ class Unicorn::HttpParser
>>> ># Drop these frozen strings when Ruby 2.2 becomes more prevalent,
>>> ># 2.2+ optimizes hash assignments when used with literal string keys
>>> >HTTP_RESPONSE_START = [ 'HTTP', '/1.1 ']
>>> > +  EMPTY_ARRAY = [].freeze
>>>
>>> (minor) this was made before commit e06b699683738f22
>>> ("http_request: freeze constant strings passed IO#write")
>>> but I've trivially fixed it up, locally.
>>
>> And I actually screwed it up, pushed out ccc-tcp-v2 branch
>> with that fix squashed in :x
>>
>> Writing tests, now...



Re: Patch: Add after_worker_exit configuration option

2017-03-21 Thread Jeremy Evans
On 02/21 07:43, Eric Wong wrote:
> Jeremy Evans  wrote:
> > This option can be used to implement custom behavior for handling
> > worker exits.  For example, let's say you have a specific request
> > that crashes a worker process, which you expect to be due to an
> > improperly programmed C extension. By modifying your worker to
> > save request related data in a temporary file and using this option,
> > you can get a record of what request is crashing the application,
> > which will make debugging easier.
> > 
> > This is not a complete patch as it doesn't include tests, but
> > before writing tests I wanted to see if this is something you'd
> > consider including in unicorn.
> 
> What advantage does this have over Ruby's builtin at_exit?

at_exit is not called if the interpreter crashes:

ruby -e 'at_exit{File.write("a.txt", "a")}; Process.kill :SEGV, $$' 2>/dev/null
([ -f a.txt ] && echo at_exit called) || echo at_exit not called

> 
> AFAIK, at_exit may be registered inside after_fork hooks to the
> same effect.
> 
> Thanks.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH] doc: fix links to raindrops project

2017-03-21 Thread Eric Wong
bogomips.org is dropping prefixes to reduce subjectAltName bloat
in TLS certificates.
---
 Links  | 2 +-
 lib/unicorn.rb | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Links b/Links
index 7c113c8..475a6c0 100644
--- a/Links
+++ b/Links
@@ -23,7 +23,7 @@ or services behind them.
 * {golden_brindle}[https://github.com/simonoff/golden_brindle] - tool to
   manage multiple unicorn instances/applications on a single server
 
-* {raindrops}[http://raindrops.bogomips.org/] - real-time stats for
+* {raindrops}[https://bogomips.org/raindrops/] - real-time stats for
   preforking Rack servers
 
 * {UnXF}[https://bogomips.org/unxf/]  Un-X-Forward* the Rack environment,
diff --git a/lib/unicorn.rb b/lib/unicorn.rb
index f122563..4bd7bda 100644
--- a/lib/unicorn.rb
+++ b/lib/unicorn.rb
@@ -95,7 +95,7 @@ def self.builder(ru, op)
 
   # returns an array of strings representing TCP listen socket addresses
   # and Unix domain socket paths.  This is useful for use with
-  # Raindrops::Middleware under Linux: http://raindrops.bogomips.org/
+  # Raindrops::Middleware under Linux: https://bogomips.org/raindrops/
   def self.listener_names
 Unicorn::HttpServer::LISTENERS.map do |io|
   Unicorn::SocketHelper.sock_name(io)
-- 
EW

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH 0/3] TCP_INFO check_client_connection followups

2017-03-21 Thread Simon Eskildsen
This looks great Eric. Thanks for taking this to the finish line!

On Wed, Mar 8, 2017 at 1:02 AM, Eric Wong  wrote:
> This series goes on top of Simon's excellent work to get
> a TCP_INFO-based check_client_connection working under Linux.
>
> First off, add a test extracted from one of Simon's messages;
> then revert the signature changes to existing methods to
> avoid breaking tmm1/gctools, and finally add a more portable
> fallback for Ruby 2.2+ users (tested on FreeBSD).
>
> Further work will improve portability of raindrops for FreeBSD
> (and maybe other *BSDs incidentally, too).  That will be sent to
> raindrops-public@bogomips => https://bogomips.org/raindrops-public/
>
> Eric Wong (3):
>   new test for check_client_connection
>   revert signature change to HttpServer#process_client
>   support "struct tcp_info" on non-Linux and Ruby 2.2+
>
>  lib/unicorn/http_request.rb  | 42 +++
>  lib/unicorn/http_server.rb   |  6 ++--
>  lib/unicorn/oob_gc.rb|  4 +--
>  lib/unicorn/socket_helper.rb | 16 +++--
>  test/unit/test_ccc.rb| 81 
> 
>  test/unit/test_request.rb| 28 +++
>  6 files changed, 150 insertions(+), 27 deletions(-)
>  create mode 100644 test/unit/test_ccc.rb
>
> Also pushed to the ccc-tcp-v3 branch:
>
> The following changes since commit 73e1ce827faad59bfcaff0bc758c8255a5e4f747:
>
>   t0011-active-unix-socket.sh: fix race condition in test (2017-03-01 
> 00:24:04 +)
>
> are available in the git repository at:
>
>   git://bogomips.org/unicorn ccc-tcp-v3
>
> for you to fetch changes up to 873218e317773462be2a72556d26dc4a723cc7be:
>
>   support "struct tcp_info" on non-Linux and Ruby 2.2+ (2017-03-08 05:39:55 
> +)
>
> 
> Eric Wong (3):
>   new test for check_client_connection
>   revert signature change to HttpServer#process_client
>   support "struct tcp_info" on non-Linux and Ruby 2.2+
>
> Simon Eskildsen (1):
>   check_client_connection: use tcp state on linux
>
>  lib/unicorn/http_request.rb  | 73 ---
>  lib/unicorn/socket_helper.rb | 16 +++--
>  test/unit/test_ccc.rb| 81 
> 
>  3 files changed, 163 insertions(+), 7 deletions(-)
>  create mode 100644 test/unit/test_ccc.rb
>
> --
> EW



Re: Patch: Add after_worker_exit configuration option

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> On 02/21 07:43, Eric Wong wrote:
> > Jeremy Evans  wrote:
> > > This option can be used to implement custom behavior for handling
> > > worker exits.  For example, let's say you have a specific request
> > > that crashes a worker process, which you expect to be due to an
> > > improperly programmed C extension. By modifying your worker to
> > > save request related data in a temporary file and using this option,
> > > you can get a record of what request is crashing the application,
> > > which will make debugging easier.
> > > 
> > > This is not a complete patch as it doesn't include tests, but
> > > before writing tests I wanted to see if this is something you'd
> > > consider including in unicorn.
> > 
> > What advantage does this have over Ruby's builtin at_exit?
> 
> at_exit is not called if the interpreter crashes:
> 
> ruby -e 'at_exit{File.write("a.txt", "a")}; Process.kill :SEGV, $$' 2>/dev/null
> ([ -f a.txt ] && echo at_exit called) || echo at_exit not called

Ah, thanks.  I didn't read the code carefully enough.
The commit message and documentation should reflect that it's
called in the master and not the worker.

Anyways, this is probably OK since I can't think of a less
intrusive or more generic (across all Rack servers) way of doing
it.  So I'm inclined to accept some version of this.

> --- a/lib/unicorn/http_server.rb
> +++ b/lib/unicorn/http_server.rb
> @@ -14,7 +14,8 @@ class Unicorn::HttpServer
>attr_accessor :app, :timeout, :worker_processes,
>  :before_fork, :after_fork, :before_exec,
>  :listener_opts, :preload_app,
> -:orig_app, :config, :ready_pipe, :user
> +:orig_app, :config, :ready_pipe, :user,
> +:after_worker_exit

I've been trying to reduce the method entry overhead across
the board.  Since this is a new field, can it be attr_writer
solely for configurator?

I don't think there's reader dependency other than below...

> @@ -395,8 +396,7 @@ def reap_all_workers
>  proc_name 'master'
>else
>  worker = @workers.delete(wpid) and worker.close rescue nil
> -m = "reaped #{status.inspect} worker=#{worker.nr rescue 'unknown'}"
> -status.success? ? logger.info(m) : logger.error(m)
> +after_worker_exit.call(self, worker, status)

Which can be made to access the ivar directly:

   @after_worker_exit.call(self, worker, status)
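
Put together, a self-contained toy showing the shape of that
suggestion (ToyServer and the hook body are made up for illustration):

  class ToyServer
    attr_writer :after_worker_exit # no reader method generated

    def reap(worker, status)
      @after_worker_exit.call(self, worker, status) # direct ivar access
    end
  end

  s = ToyServer.new
  s.after_worker_exit = ->(srv, worker, status) { puts "reaped #{status}" }
  s.reap(:worker0, :exited) # prints "reaped exited"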
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH 2/3] revert signature change to HttpServer#process_client

2017-03-21 Thread Eric Wong
We can force kgio_tryaccept to return an internal class
for TCP objects by subclassing Kgio::TCPServer.

This avoids breakage in any unfortunate projects which depend on
our undocumented internal APIs, such as gctools


Cc: Aman Gupta 
---
 lib/unicorn/http_request.rb  | 10 +-
 lib/unicorn/http_server.rb   |  6 +++---
 lib/unicorn/oob_gc.rb|  4 ++--
 lib/unicorn/socket_helper.rb | 16 ++--
 test/unit/test_request.rb| 28 ++--
 5 files changed, 38 insertions(+), 26 deletions(-)

diff --git a/lib/unicorn/http_request.rb b/lib/unicorn/http_request.rb
index 9acde50..68bde16 100644
--- a/lib/unicorn/http_request.rb
+++ b/lib/unicorn/http_request.rb
@@ -61,7 +61,7 @@ def self.check_client_connection=(bool)
   # returns an environment hash suitable for Rack if successful
   # This does minimal exception trapping and it is up to the caller
   # to handle any socket errors (e.g. user aborted upload).
-  def read(socket, listener)
+  def read(socket)
 clear
 e = env
 
@@ -82,7 +82,7 @@ def read(socket, listener)
   false until add_parse(socket.kgio_read!(16384))
 end
 
-check_client_connection(socket, listener) if @@check_client_connection
+check_client_connection(socket) if @@check_client_connection
 
 e['rack.input'] = 0 == content_length ?
   NULL_IO : @@input_class.new(socket, self)
@@ -105,8 +105,8 @@ def hijacked?
   end
 
   if defined?(Raindrops::TCP_Info)
-def check_client_connection(socket, listener) # :nodoc:
-  if Kgio::TCPServer === listener
+def check_client_connection(socket) # :nodoc:
+  if Unicorn::TCPClient === socket
 @@tcp_info ||= Raindrops::TCP_Info.new(socket)
 @@tcp_info.get!(socket)
 raise Errno::EPIPE, "client closed connection".freeze,
@@ -127,7 +127,7 @@ def closed_state?(state) # :nodoc:
   end
 end
   else
-def check_client_connection(socket, listener) # :nodoc:
+def check_client_connection(socket) # :nodoc:
   write_http_header(socket)
 end
   end
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index 2aa1072..c2086cb 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -558,8 +558,8 @@ def e100_response_write(client, env)
 
   # once a client is accepted, it is processed in its entirety here
   # in 3 easy steps: read request, call app, write app response
-  def process_client(client, listener)
-status, headers, body = @app.call(env = @request.read(client, listener))
+  def process_client(client)
+status, headers, body = @app.call(env = @request.read(client))
 
 begin
   return if @request.hijacked?
@@ -655,7 +655,7 @@ def worker_loop(worker)
 # Unicorn::Worker#kgio_tryaccept is not like accept(2) at all,
 # but that will return false
 if client = sock.kgio_tryaccept
-  process_client(client, sock)
+  process_client(client)
   nr += 1
   worker.tick = time_now.to_i
 end
diff --git a/lib/unicorn/oob_gc.rb b/lib/unicorn/oob_gc.rb
index 74a1d51..5572e59 100644
--- a/lib/unicorn/oob_gc.rb
+++ b/lib/unicorn/oob_gc.rb
@@ -67,8 +67,8 @@ def self.new(app, interval = 5, path = %r{\A/})
 
   #:stopdoc:
   PATH_INFO = "PATH_INFO"
-  def process_client(client, listener)
-super(client, listener) # Unicorn::HttpServer#process_client
+  def process_client(client)
+super(client) # Unicorn::HttpServer#process_client
 if OOBGC_PATH =~ OOBGC_ENV[PATH_INFO] && ((@@nr -= 1) <= 0)
   @@nr = OOBGC_INTERVAL
   OOBGC_ENV.clear
diff --git a/lib/unicorn/socket_helper.rb b/lib/unicorn/socket_helper.rb
index df8315e..5371413 100644
--- a/lib/unicorn/socket_helper.rb
+++ b/lib/unicorn/socket_helper.rb
@@ -3,6 +3,18 @@
 require 'socket'
 
 module Unicorn
+
+  # Instead of using a generic Kgio::Socket for everything,
+  # tag TCP sockets so we can use TCP_INFO under Linux without
+  # incurring extra syscalls for Unix domain sockets.
+  # TODO: remove these when we remove kgio
+  TCPClient = Class.new(Kgio::Socket) # :nodoc:
+  class TCPSrv < Kgio::TCPServer # :nodoc:
+def kgio_tryaccept # :nodoc:
+  super(TCPClient)
+end
+  end
+
   module SocketHelper
 
 # internal interface
@@ -148,7 +160,7 @@ def new_tcp_server(addr, port, opt)
   end
   sock.bind(Socket.pack_sockaddr_in(port, addr))
   sock.autoclose = false
-  Kgio::TCPServer.for_fd(sock.fileno)
+  TCPSrv.for_fd(sock.fileno)
 end
 
 # returns rfc2732-style (e.g. "[::1]:666") addresses for IPv6
@@ -185,7 +197,7 @@ def sock_name(sock)
 def server_cast(sock)
   begin
 Socket.unpack_sockaddr_in(sock.getsockname)
-Kgio::TCPServer.for_fd(sock.fileno)
+TCPSrv.for_fd(sock.fileno)
   rescue ArgumentError
 Kgio::UNIXServer.for_fd(sock.fileno)
   end
diff --git a/test/unit/test_request.rb b/test/unit/test_request.rb
index dbe8af7..f0ccaf7 100644

Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Simon Eskildsen
Looks like Puma encountered some issues with some Linux distros'
kernels not supporting this. That's crazy...

https://github.com/puma/puma/issues/1241



[PATCH] test-lib: expr(1) portability fix

2017-03-21 Thread Eric Wong
GNU expr supports '+' to match one or more occurrences, but
it seems the expr(1) on my FreeBSD installation does not.
---
 This only covers the Bourne sh integration tests in t/

 I still don't trust the Ruby language (and test libraries
 written in it) to not change incompatibly after all these years...

 t/test-lib.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 28d6a88..7f97958 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -106,8 +106,8 @@ check_stderr () {
 # unicorn_setup
 unicorn_setup () {
eval $(unused_listen)
-   port=$(expr $listen : '[^:]*:\([0-9]\+\)')
-   host=$(expr $listen : '\([^:]*\):[0-9]\+')
+   port=$(expr $listen : '[^:]*:\([0-9]*\)')
+   host=$(expr $listen : '\([^:][^:]*\):[0-9][0-9]*')
 
rtmpfiles unicorn_config pid r_err r_out fifo tmp ok
cat > $unicorn_config 

Patch: Fix code example in after_worker_exit documentation

2017-03-21 Thread Jeremy Evans
Here's a separate commit for the documentation fix.

From 6efb4b8e07a0459180579b72fac5003e424837c0 Mon Sep 17 00:00:00 2001
From: Jeremy Evans 
Date: Thu, 23 Feb 2017 10:17:19 -0800
Subject: [PATCH] Fix code example in after_worker_exit documentation

---
 lib/unicorn/configurator.rb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index 5bad925..1e2b6e4 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -162,7 +162,7 @@ def after_fork(*args, &block)
   # sets after_worker_exit hook to a given block.  This block will be called
   # by the master process after a worker exits:
   #
-  #  after_fork do |server,worker,status|
+  #  after_worker_exit do |server,worker,status|
   ## status is a Process::Status instance for the exited worker process
   #unless status.success?
   #  server.logger.error("worker process failure: #{status.inspect}")
-- 
2.11.0

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



check_client_connection using getsockopt(2)

2017-03-21 Thread Simon Eskildsen
Hello!

Almost five years ago Tom Burns contributed the patch, in collaboration
with Eric, that introduced the `check_client_connection` configuration
option. To summarize the patch, it was a solution to a problem we have
with rapid refreshes during sales: Unicorn would render a page 5 times
if an eager customer refreshed Shopify 5 times, despite the customer
only seeing one rendering.  This is a large amount of lost capacity.
Four of these connections would effectively be in a `CLOSE` state in
the backlog, get `accept(2)`ed, and a response would be sent back only
to get an error that the client had closed its connection.

The patch solved this problem by splitting the response into two
writes: instead of a single `write(2)`, it first writes the generic
HTTP version line, then jumps into the middleware stack and renders
the Rack response in a second write.  If the client had closed, the
first `write(2)` of the HTTP version header would usually throw an
exception, causing Unicorn to bail before rendering the heavy Rack
response. This saves a large amount of capacity during spiky traffic.
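
A self-contained illustration of the loopback case (stdlib only;
port 0 picks a free port, and the two small writes stand in for
unicorn's HTTP_RESPONSE_START):

  require 'socket'

  server = TCPServer.new('127.0.0.1', 0)
  client = TCPSocket.new('127.0.0.1', server.addr[1])
  conn = server.accept
  client.close                  # simulate the impatient browser

  begin
    conn.write('HTTP')          # first tiny write usually succeeds
    conn.write('/1.1 ')         # second one tends to raise on loopback
    puts 'client still looks connected'
  rescue Errno::EPIPE, Errno::ECONNRESET
    puts 'client gone, skip rendering the response'
  end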

A subsequent commit after testing by Eric revealed that:

> This only affects clients connecting over Unix domain sockets and TCP via 
> loopback (127...*). It is unlikely to detect disconnects if the client is on 
> a remote host (even on a fast LAN).

Thanks for that testing Eric. If we hadn't stumbled upon this in the
documentation proactively, this wouldn't have been easy to debug in
production.

In my testing, I can confirm Eric's results. My testing essentially
consists of a snippet like the following, which sends rapid requests
and then closes the client. I've confirmed with Wireshark this is roughly
how popular browsers behave when refreshing fast on a slowly rendered
page:

100.times do |i|
  client = TCPSocket.new("some-unicorn", 20_000)
  client.write("GET /collections/#{rand(1)}
HTTP/1.1\r\nHost:walrusser.myshopify.com\r\n\r\n")
  client.close
end

This confirms Eric's comment that the existing
`check_client_connection` works perfectly on loopback, but as soon as
you put an actual network between the Unicorn and the client, it's only
effective 20% of the time, even with `TCP_NODELAY`. I assume this is
due to buffering, even with Nagle's algorithm disabled. As we're
changing our architecture, we're moving from ngx (lb) -> ngx (host) ->
unicorn to ngx (lb) -> unicorn. That means this patch will no longer
work for us.

I propose that, instead of the early `write(2)` hack, we use
`getsockopt(2)` with the `TCP_INFO` option and read the state of the
socket. If it's in `CLOSE_WAIT` or `CLOSE`, we kill the connection and
move on to the next.

https://github.com/torvalds/linux/blob/8fa3b6f9392bf6d90cb7b908e07bd90166639f0a/include/uapi/linux/tcp.h#L163

This is not going to be portable, but we can do this only on Linux,
which I suspect is where most production deployments of Unicorn that
would benefit from this feature run. It's not useful in development
(where the OS is less likely to be Linux). We could also introduce
it under a new configuration option if desired. In my testing, this
works to reject 100% of requests early when not on loopback.

The code is essentially:

def client_closed?
  tcp_info = socket.getsockopt(Socket::SOL_TCP, Socket::TCP_INFO)
  state = tcp_info.unpack("c")[0]
  state == TCP_CLOSE || state == TCP_CLOSE_WAIT
end

This could be called at the top of `#process_client` in `http_server.rb`.

Would there be interest in this upstream? Any comments on this
proposed implementation? Currently, we're using a middleware with the
Rack hijack API, but this seems like it belongs at the webserver
level.



[PATCH] tests: keep disabled tests defined

2017-03-21 Thread Eric Wong
Some versions of test-unit will fail if an unspecified test is
attempted via "-n", so we need to define an empty test.

We cannot use "skip", either, as that seems exclusive to
minitest; and we won't use minitest since it has more
incompatible changes than test-unit over the last 8 years.
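
 The pattern applied throughout the diff, as a stand-alone sketch
 (test-unit required; the SO_REUSEPORT check is just an example):

  require 'socket'
  require 'test/unit'

  class TestConditional < Test::Unit::TestCase
    # keep the method defined on every platform so "-n test_reuseport"
    # never names a missing test; bail out of the body at runtime instead
    def test_reuseport
      return unless defined?(Socket::SO_REUSEPORT)
      s = Socket.new(:INET, :STREAM)
      s.setsockopt(:SOL_SOCKET, :SO_REUSEPORT, 1)
      assert_operator s.getsockopt(:SOL_SOCKET, :SO_REUSEPORT).int, :>, 0
    ensure
      s.close if s
    end
  end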
---
 test/exec/test_exec.rb  |  7 ---
 test/unit/test_http_parser.rb   | 18 --
 test/unit/test_socket_helper.rb | 12 
 test/unit/test_util.rb  |  4 ++--
 4 files changed, 14 insertions(+), 27 deletions(-)

diff --git a/test/exec/test_exec.rb b/test/exec/test_exec.rb
index ca0b7bc..4941c4e 100644
--- a/test/exec/test_exec.rb
+++ b/test/exec/test_exec.rb
@@ -97,6 +97,9 @@ def teardown
   end
 
   def test_sd_listen_fds_emulation
+# [ruby-core:69895] [Bug #11336] fixed by r51576
+return if RUBY_VERSION.to_f < 2.3
+
 File.open("config.ru", "wb") { |fp| fp.write(HI) }
 sock = TCPServer.new(@addr, @port)
 
@@ -124,9 +127,7 @@ def test_sd_listen_fds_emulation
 end
   ensure
 sock.close if sock
-# disabled test on old Rubies: https://bugs.ruby-lang.org/issues/11336
-# [ruby-core:69895] [Bug #11336] fixed by r51576
-  end if RUBY_VERSION.to_f >= 2.3
+  end
 
   def test_inherit_listener_unspecified
 File.open("config.ru", "wb") { |fp| fp.write(HI) }
diff --git a/test/unit/test_http_parser.rb b/test/unit/test_http_parser.rb
index 7cbc0f8..31e6f71 100644
--- a/test/unit/test_http_parser.rb
+++ b/test/unit/test_http_parser.rb
@@ -851,24 +851,6 @@ def test_empty_header
 assert_equal '', parser.env['HTTP_HOST']
   end
 
-  # so we don't  care about the portability of this test
-  # if it doesn't leak on Linux, it won't leak anywhere else
-  # unless your C compiler or platform is otherwise broken
-  LINUX_PROC_PID_STATUS = "/proc/self/status"
-  def test_memory_leak
-match_rss = /^VmRSS:\s+(\d+)/
-if File.read(LINUX_PROC_PID_STATUS) =~ match_rss
-  before = $1.to_i
-  100.times { Unicorn::HttpParser.new }
-  File.read(LINUX_PROC_PID_STATUS) =~ match_rss
-  after = $1.to_i
-  diff = after - before
-  assert(diff < 1, "memory grew more than 10M: #{diff}")
-end
-  end if RUBY_PLATFORM =~ /linux/ &&
- File.readable?(LINUX_PROC_PID_STATUS) &&
- !defined?(RUBY_ENGINE)
-
   def test_memsize
 require 'objspace'
 if ObjectSpace.respond_to?(:memsize_of)
diff --git a/test/unit/test_socket_helper.rb b/test/unit/test_socket_helper.rb
index 7526e82..8699409 100644
--- a/test/unit/test_socket_helper.rb
+++ b/test/unit/test_socket_helper.rb
@@ -150,28 +150,31 @@ def test_sock_name
   end
 
   def test_tcp_defer_accept_default
+return unless defined?(TCP_DEFER_ACCEPT)
 port = unused_port @test_addr
 name = "#@test_addr:#{port}"
 sock = bind_listen(name)
 cur = sock.getsockopt(Socket::SOL_TCP, TCP_DEFER_ACCEPT).unpack('i')[0]
 assert cur >= 1
-  end if defined?(TCP_DEFER_ACCEPT)
+  end
 
   def test_tcp_defer_accept_disable
+return unless defined?(TCP_DEFER_ACCEPT)
 port = unused_port @test_addr
 name = "#@test_addr:#{port}"
 sock = bind_listen(name, :tcp_defer_accept => false)
 cur = sock.getsockopt(Socket::SOL_TCP, TCP_DEFER_ACCEPT).unpack('i')[0]
 assert_equal 0, cur
-  end if defined?(TCP_DEFER_ACCEPT)
+  end
 
   def test_tcp_defer_accept_nr
+return unless defined?(TCP_DEFER_ACCEPT)
 port = unused_port @test_addr
 name = "#@test_addr:#{port}"
 sock = bind_listen(name, :tcp_defer_accept => 60)
 cur = sock.getsockopt(Socket::SOL_TCP, TCP_DEFER_ACCEPT).unpack('i')[0]
 assert cur > 1
-  end if defined?(TCP_DEFER_ACCEPT)
+  end
 
   def test_ipv6only
 port = begin
@@ -186,6 +189,7 @@ def test_ipv6only
   end
 
   def test_reuseport
+return unless defined?(Socket::SO_REUSEPORT)
 port = unused_port @test_addr
 name = "#@test_addr:#{port}"
 sock = bind_listen(name, :reuseport => true)
@@ -193,5 +197,5 @@ def test_reuseport
 assert_operator cur, :>, 0
   rescue Errno::ENOPROTOOPT
 # kernel does not support SO_REUSEPORT (older Linux)
-  end if defined?(Socket::SO_REUSEPORT)
+  end
 end
diff --git a/test/unit/test_util.rb b/test/unit/test_util.rb
index 4d17a16..dc6302e 100644
--- a/test/unit/test_util.rb
+++ b/test/unit/test_util.rb
@@ -69,7 +69,7 @@ def test_reopen_logs_renamed_with_encoding
   }
 }
 tmp.close!
-  end if STDIN.respond_to?(:external_encoding)
+  end
 
   def test_reopen_logs_renamed_with_internal_encoding
 tmp = Tempfile.new('')
@@ -101,5 +101,5 @@ def test_reopen_logs_renamed_with_internal_encoding
   }
 }
 tmp.close!
-  end if STDIN.respond_to?(:external_encoding)
+  end
 end
-- 
EW

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add after_worker_exit configuration option

2017-03-21 Thread Jeremy Evans
On 02/21 08:15, Eric Wong wrote:
> Jeremy Evans  wrote:
> > On 02/21 07:43, Eric Wong wrote:
> > > Jeremy Evans  wrote:
> > > > This option can be used to implement custom behavior for handling
> > > > worker exits.  For example, let's say you have a specific request
> > > > that crashes a worker process, which you expect to be due to an
> > > > improperly programmed C extension. By modifying your worker to
> > > > save request related data in a temporary file and using this option,
> > > > you can get a record of what request is crashing the application,
> > > > which will make debugging easier.
> > > > 
> > > > This is not a complete patch as it doesn't include tests, but
> > > > before writing tests I wanted to see if this is something you'd
> > > > consider including in unicorn.
> > > 
> > > What advantage does this have over Ruby's builtin at_exit?
> > 
> > at_exit is not called if the interpreter crashes:
> > 
> > ruby -e 'at_exit{File.write("a.txt", "a")}; Process.kill :SEGV, $$' 2>/dev/null
> > ([ -f a.txt ] && echo at_exit called) || echo at_exit not called
> 
> Ah, thanks.  I didn't read the code carefully enough.
> The commit message and documentation should reflect that it's
> called in the master and not the worker.
> 
> Anyways, this is probably OK since I can't think of a less
> intrusive or more generic (across all Rack servers) way of doing
> it.  So I'm inclined to accept some version of this.
> 
> > --- a/lib/unicorn/http_server.rb
> > +++ b/lib/unicorn/http_server.rb
> > @@ -14,7 +14,8 @@ class Unicorn::HttpServer
> >attr_accessor :app, :timeout, :worker_processes,
> >  :before_fork, :after_fork, :before_exec,
> >  :listener_opts, :preload_app,
> > -:orig_app, :config, :ready_pipe, :user
> > +:orig_app, :config, :ready_pipe, :user,
> > +:after_worker_exit
> 
> I've been trying to reduce the method entry overhead across
> the board.  Since this is a new field, can it be attr_writer
> solely for configurator?
> 
> I don't think there's reader dependency other than below...
> 
> > @@ -395,8 +396,7 @@ def reap_all_workers
> >  proc_name 'master'
> >else
> >  worker = @workers.delete(wpid) and worker.close rescue nil
> > -m = "reaped #{status.inspect} worker=#{worker.nr rescue 'unknown'}"
> > -status.success? ? logger.info(m) : logger.error(m)
> > +after_worker_exit.call(self, worker, status)
> 
> Which can be made to access the ivar directly:
> 
>  @after_worker_exit.call(self, worker, status)

Here's a revised patch that should address the issues you identified:

From 91b95652e4bdb7fc9af77e9ac06ad7400faef796 Mon Sep 17 00:00:00 2001
From: Jeremy Evans 
Date: Fri, 17 Feb 2017 16:12:33 -0800
Subject: [PATCH] Add after_worker_exit configuration option

This option is executed in the master process following all
worker process exits.  It is most useful in the case where
the worker process crashes the ruby interpreter, as the worker
process may not be able to send error notifications
appropriately.

For example, let's say you have a specific request that crashes a
worker process, which you expect to be due to an improperly
programmed C extension. By modifying your worker to save request
related data in a temporary file and using this option, you can get
a record of what request is crashing the application, which will
make debugging easier.

Example:

after_worker_exit do |server, worker, status|
  server.logger.info "worker #{status.success? ? 'exit' : 'crash'}: #{status}"

  file = "request.#{status.pid}.txt"
  if File.exist?(file)
do_something_with(File.read(file)) unless status.success?
File.delete(file)
  end
end
---
 lib/unicorn/configurator.rb | 21 +
 lib/unicorn/http_server.rb  |  4 ++--
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index 3329c10..5bad925 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -41,6 +41,14 @@ class Unicorn::Configurator
 :before_exec => lambda { |server|
 server.logger.info("forked child re-executing...")
   },
+:after_worker_exit => lambda { |server, worker, status|
+m = "reaped #{status.inspect} worker=#{worker.nr rescue 'unknown'}"
+if status.success?
+  server.logger.info(m)
+else
+  server.logger.error(m)
+end
+  },
 :pid => nil,
 :preload_app => false,
 :check_client_connection => false,
@@ -151,6 +159,19 @@ def after_fork(*args, &block)
 set_hook(:after_fork, block_given? ? block : args[0])
   end
 
+  # sets after_worker_exit hook to a given block.  This block will be called
+  # by the master process after a worker exits:
+  #
+  #  after_fork do |server,worker,status|
+  ## status is a Process::Status instance for the exited worker process
+  #unless status.success?
+  #  server.

[PATCH 3/3] support "struct tcp_info" on non-Linux and Ruby 2.2+

2017-03-21 Thread Eric Wong
Ruby 2.2+ can show "struct tcp_info" as a string via
Socket::Option#inspect, and we can attempt to parse it
out to extract the information we need.

Parsing this string is inefficient, but does not depend on the
ordering of the tcp_info struct.
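
A quick way to probe whether a given Ruby/OS combination supports this
fallback (a sketch using a throwaway loopback connection):

  require 'socket'

  server = TCPServer.new('127.0.0.1', 0)
  sock = TCPSocket.new('127.0.0.1', server.addr[1])
  opt = sock.getsockopt(:IPPROTO_TCP, :TCP_INFO) rescue nil
  p opt && opt.inspect[/\bstate=\S+/] # "state=ESTABLISHED" when supported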
---
 lib/unicorn/http_request.rb | 32 +++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/lib/unicorn/http_request.rb b/lib/unicorn/http_request.rb
index 68bde16..c08097c 100644
--- a/lib/unicorn/http_request.rb
+++ b/lib/unicorn/http_request.rb
@@ -29,6 +29,7 @@ class Unicorn::HttpParser
   EMPTY_ARRAY = [].freeze
   @@input_class = Unicorn::TeeInput
   @@check_client_connection = false
+  @@tcpi_inspect_ok = true
 
   def self.input_class
 @@input_class
@@ -127,8 +128,37 @@ def closed_state?(state) # :nodoc:
   end
 end
   else
+
+# Ruby 2.2+ can show struct tcp_info as a string Socket::Option#inspect.
+# Not that efficient, but probably still better than doing unnecessary
+# work after a client gives up.
 def check_client_connection(socket) # :nodoc:
-  write_http_header(socket)
+  if Unicorn::TCPClient === socket && @@tcpi_inspect_ok
+opt = socket.getsockopt(:IPPROTO_TCP, :TCP_INFO).inspect
+if opt =~ /\bstate=(\S+)/
+  @@tcpi_inspect_ok = true
+  raise Errno::EPIPE, "client closed connection".freeze,
+EMPTY_ARRAY if closed_state_str?($1)
+else
+  @@tcpi_inspect_ok = false
+  write_http_header(socket)
+end
+opt.clear
+  else
+write_http_header(socket)
+  end
+end
+
+def closed_state_str?(state)
+  case state
+  when 'ESTABLISHED'
+false
+  # not a typo, ruby maps TCP_CLOSE (no 'D') to state=CLOSED (w/ 'D')
+  when 'CLOSE_WAIT', 'TIME_WAIT', 'CLOSED', 'LAST_ACK', 'CLOSING'
+true
+  else
+false
+  end
 end
   end
 
-- 
EW

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] Add worker_exec configuration option V2

2017-03-21 Thread Jeremy Evans
On 03/08 08:02, Eric Wong wrote:
> Jeremy Evans  wrote:
> > The worker_exec configuration option makes all worker processes
> > exec after forking.  This initializes the worker processes with
> > separate memory layouts, defeating address space discovery
> > attacks on operating systems supporting address space layout
> > randomization, such as Linux, MacOS X, NetBSD, OpenBSD, and
> > Solaris.
> > 
> > Support for execing workers is very similar to support for reexecing
> > the master process.  The main difference is the worker's to_io and
> > master pipes also need to be inherited after worker exec just as the
> > listening sockets need to be inherited after reexec.
> 
> Thanks, this seems like an acceptable feature.

Eric,

Here's V2 of the patch, which I think should address all of the issues
you pointed out.

Thanks,
Jeremy

From 8e68bf8c6a8b91704f31dd9b9a62e6f2e330e380 Mon Sep 17 00:00:00 2001
From: Jeremy Evans 
Date: Wed, 8 Mar 2017 10:19:02 -0800
Subject: [PATCH] Add worker_exec configuration option

The worker_exec configuration option makes all worker processes
exec after forking.  This initializes the worker processes with
separate memory layouts, defeating address space discovery
attacks on operating systems supporting address space layout
randomization, such as Linux, MacOS X, NetBSD, OpenBSD, and
Solaris.

Support for execing workers is very similar to support for reexecing
the master process.  The main difference is the worker's to_io and
master pipes also need to be inherited after worker exec just as the
listening sockets need to be inherited after reexec.

Because execing workers is similar to reexecing the master, this
extracts a couple of methods from reexec (listener_sockets and
close_sockets_on_exec), so they can be reused in worker_spawn.
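
For illustration, a hypothetical unicorn config snippet using the new
option (per the documentation added below, it only makes sense without
app preloading):

  worker_processes 4
  preload_app false
  worker_exec true
  listen "127.0.0.1:8080"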
---
 lib/unicorn/configurator.rb | 10 ++
 lib/unicorn/http_server.rb  | 83 ++---
 lib/unicorn/worker.rb   |  5 +--
 3 files changed, 77 insertions(+), 21 deletions(-)

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index 7ed5ffa..f69f220 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -53,6 +53,7 @@ class Unicorn::Configurator
 server.logger.info("worker=#{worker.nr} ready")
   },
 :pid => nil,
+:worker_exec => false,
 :preload_app => false,
 :check_client_connection => false,
 :rewindable_input => true, # for Rack 2.x: (Rack::VERSION[0] <= 1),
@@ -239,6 +240,15 @@ def timeout(seconds)
 set[:timeout] = seconds > max ? max : seconds
   end
 
+  # Whether to exec in each worker process after forking.  This changes the
+  # memory layout of each worker process, which is a security feature designed
+  # to defeat possible address space discovery attacks.  Note that using
+  # worker_exec only makes sense if you are not preloading the application,
+  # and will result in higher memory usage.
+  def worker_exec(bool)
+set_bool(:worker_exec, bool)
+  end
+
   # sets the current number of worker_processes to +nr+.  Each worker
   # process will serve exactly one client at a time.  You can
   # increment or decrement this value at runtime by sending SIGTTIN
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index ef897ad..a5bd2c4 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -15,7 +15,7 @@ class Unicorn::HttpServer
 :before_fork, :after_fork, :before_exec,
 :listener_opts, :preload_app,
 :orig_app, :config, :ready_pipe, :user
-  attr_writer   :after_worker_exit, :after_worker_ready
+  attr_writer   :after_worker_exit, :after_worker_ready, :worker_exec
 
   attr_reader :pid, :logger
   include Unicorn::SocketHelper
@@ -105,6 +105,14 @@ def initialize(app, options = {})
 # list of signals we care about and trap in master.
 @queue_sigs = [
   :WINCH, :QUIT, :INT, :TERM, :USR1, :USR2, :HUP, :TTIN, :TTOU ]
+
+@worker_data = if worker_data = ENV['UNICORN_WORKER']
+  worker_data = worker_data.split(',').map!(&:to_i)
+  worker_data[1] = worker_data.slice!(1..2).map do |i|
+Kgio::Pipe.for_fd(i)
+  end
+  worker_data
+end
   end
 
   # Runs the thing.  Returns self so you can run join on it
@@ -113,7 +121,7 @@ def start
 # this pipe is used to wake us up from select(2) in #join when signals
 # are trapped.  See trap_deferred.
 @self_pipe.replace(Unicorn.pipe)
-@master_pid = $$
+@master_pid = @worker_data ? Process.ppid : $$
 
 # setup signal handlers before writing pid file in case people get
 # trigger happy and send signals as soon as the pid file exists.
@@ -430,11 +438,7 @@ def reexec
 end
 
 @reexec_pid = fork do
-  listener_fds = {}
-  LISTENERS.each do |sock|
-sock.close_on_exec = false
-listener_fds[sock.fileno] = sock
-  end
+  listener_fds = listener_sockets
   ENV['UNICORN_FD'] = listener_fds.keys.join(',')
   

[PATCH 0/3] TCP_INFO check_client_connection followups

2017-03-21 Thread Eric Wong
This series goes on top of Simon's excellent work to get
a TCP_INFO-based check_client_connection working under Linux.

First off, add a test extracted from one of Simon's messages;
then revert the signature changes to existing methods to
avoid breaking tmm1/gctools, and finally add a more portable
fallback for Ruby 2.2+ users (tested on FreeBSD).

Further work will improve portability of raindrops for FreeBSD
(and maybe other *BSDs incidentally, too).  That will be sent to
raindrops-public@bogomips => https://bogomips.org/raindrops-public/

Eric Wong (3):
  new test for check_client_connection
  revert signature change to HttpServer#process_client
  support "struct tcp_info" on non-Linux and Ruby 2.2+

 lib/unicorn/http_request.rb  | 42 +++
 lib/unicorn/http_server.rb   |  6 ++--
 lib/unicorn/oob_gc.rb|  4 +--
 lib/unicorn/socket_helper.rb | 16 +++--
 test/unit/test_ccc.rb| 81 
 test/unit/test_request.rb| 28 +++
 6 files changed, 150 insertions(+), 27 deletions(-)
 create mode 100644 test/unit/test_ccc.rb

Also pushed to the ccc-tcp-v3 branch:

The following changes since commit 73e1ce827faad59bfcaff0bc758c8255a5e4f747:

  t0011-active-unix-socket.sh: fix race condition in test (2017-03-01 00:24:04 
+)

are available in the git repository at:

  git://bogomips.org/unicorn ccc-tcp-v3

for you to fetch changes up to 873218e317773462be2a72556d26dc4a723cc7be:

  support "struct tcp_info" on non-Linux and Ruby 2.2+ (2017-03-08 05:39:55 
+)


Eric Wong (3):
  new test for check_client_connection
  revert signature change to HttpServer#process_client
  support "struct tcp_info" on non-Linux and Ruby 2.2+

Simon Eskildsen (1):
  check_client_connection: use tcp state on linux

 lib/unicorn/http_request.rb  | 73 ---
 lib/unicorn/socket_helper.rb | 16 +++--
 test/unit/test_ccc.rb| 81 
 3 files changed, 163 insertions(+), 7 deletions(-)
 create mode 100644 test/unit/test_ccc.rb

-- 
EW
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Patch: Add support for chroot to Worker#user V2

2017-03-21 Thread Jeremy Evans
Here's V2 of the chroot support patch.

This changes the commit message language, and supports chrooting to
a directory that isn't the current directory.

From 9bd82792d57f54a868c9a0e9af2bd452f3ef298d Mon Sep 17 00:00:00 2001
From: Jeremy Evans 
Date: Tue, 21 Feb 2017 08:44:34 -0800
Subject: [PATCH] Add support for chroot to Worker#user

Any chrooting would need to happen inside Worker#user, because
you can't chroot until after you have parsed the list of groups,
and you must chroot before dropping root privileges.

chroot adds an extra layer of security, so that if the unicorn
process is exploited, file system access is limited to the chroot
directory instead of the entire file system.
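
An illustrative after_fork hook using the patched signature
(user, group, chroot path); the user/group names and path here are
hypothetical:

  after_fork do |server, worker|
    worker.user('nobody', 'nobody', '/var/empty')
  end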
---
 lib/unicorn/worker.rb | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/lib/unicorn/worker.rb b/lib/unicorn/worker.rb
index 6748a2f..e22c1bf 100644
--- a/lib/unicorn/worker.rb
+++ b/lib/unicorn/worker.rb
@@ -111,9 +111,11 @@ def close # :nodoc:
   # In most cases, you should be using the Unicorn::Configurator#user
   # directive instead.  This method should only be used if you need
   # fine-grained control of exactly when you want to change permissions
-  # in your after_fork hooks.
+  # in your after_fork or after_worker_ready hooks, or if you want to
+  # use the chroot support.
   #
-  # Changes the worker process to the specified +user+ and +group+
+  # Changes the worker process to the specified +user+ and +group+,
+  # and chroots to the current working directory if +chroot+ is set.
   # This is only intended to be called from within the worker
   # process from the +after_fork+ hook.  This should be called in
   # the +after_fork+ hook after any privileged functions need to be
@@ -123,7 +125,7 @@ def close # :nodoc:
   # directly back to the caller (usually the +after_fork+ hook.
   # These errors commonly include ArgumentError for specifying an
   # invalid user/group and Errno::EPERM for insufficient privileges
-  def user(user, group = nil)
+  def user(user, group = nil, chroot = false)
 # we do not protect the caller, checking Process.euid == 0 is
 # insufficient because modern systems have fine-grained
 # capabilities.  Let the caller handle any and all errors.
@@ -134,6 +136,11 @@ def user(user, group = nil)
   Process.initgroups(user, gid)
   Process::GID.change_privilege(gid)
 end
+if chroot
+  chroot = Dir.pwd if chroot == true
+  Dir.chroot(chroot)
+  Dir.chdir('/')
+end
 Process.euid != uid and Process::UID.change_privilege(uid)
 @switched = true
   end
-- 
2.11.0

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH] http_response: clear header buffer before response body iteration

2017-03-21 Thread Eric Wong
To alleviate some memory pressure during response body
iteration, rely on String#clear to free(3) memory used for the
large header buffer.  Given the size of some response headers
such as cookies, this can release several kilobytes of malloc
heap memory for immediate use by other layers of the application
stack.
---
 lib/unicorn/http_response.rb | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/unicorn/http_response.rb b/lib/unicorn/http_response.rb
index ec128e4..284fc68 100644
--- a/lib/unicorn/http_response.rb
+++ b/lib/unicorn/http_response.rb
@@ -49,6 +49,7 @@ def http_response_write(socket, status, headers, body,
 end
   end
   socket.write(buf << "\r\n".freeze)
+  buf.clear
 end
 
 if hijack
-- 
EW

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Simon Eskildsen
I left out a TCPSocket check; we'll need a `socket.is_a?(TCPSocket)`
in there. I'll hold off on sending an updated patch in case you have
other comments. I'm also not entirely sure we need the `socket.close`.
What do you think?

On Sat, Feb 25, 2017 at 9:03 AM, Simon Eskildsen
 wrote:
> The implementation of the check_client_connection causing an early write is
> ineffective when not performed on loopback. In my testing, when on 
> non-loopback,
> such as another host, we only see a 10-20% rejection rate with TCP_NODELAY of
> clients that are closed. This means 90-80% of responses in this case are
> rendered in vain.
>
> This patch uses getsockopt(2) with TCP_INFO on Linux to read the socket
> state.
> If the socket is in a state where it doesn't take a response, we'll abort the
> request with a similar error as to what write(2) would give us on a closed
> socket in case of an error.
>
> In my testing, this has a 100% rejection rate. This testing was conducted with
> the following script:
>
> 100.times do |i|
>   client = TCPSocket.new("some-unicorn", 20_000)
>   client.write("GET /collections/#{rand(1)}
> HTTP/1.1\r\nHost:walrusser.myshopify.com\r\n\r\n")
>   client.close
> end
> ---
>  lib/unicorn/http_request.rb | 19 +++
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/lib/unicorn/http_request.rb b/lib/unicorn/http_request.rb
> index 0c1f9bb..b4c95fc 100644
> --- a/lib/unicorn/http_request.rb
> +++ b/lib/unicorn/http_request.rb
> @@ -31,6 +31,9 @@ class Unicorn::HttpParser
>@@input_class = Unicorn::TeeInput
>@@check_client_connection = false
>
> +  # TCP_TIME_WAIT: 6, TCP_CLOSE: 7, TCP_CLOSE_WAIT: 8, TCP_LAST_ACK: 9
> +  IGNORED_CHECK_CLIENT_SOCKET_STATES = (6..9)
> +
>def self.input_class
>  @@input_class
>end
> @@ -83,10 +86,18 @@ def read(socket)
>false until add_parse(socket.kgio_read!(16384))
>  end
>
> -# detect if the socket is valid by writing a partial response:
> -if @@check_client_connection && headers?
> -  self.response_start_sent = true
> -  HTTP_RESPONSE_START.each { |c| socket.write(c) }
> +# detect if the socket is valid by checking client connection.
> +if @@check_client_connection
> +  if defined?(Raindrops::TCP_Info)
> +tcp_info = Raindrops::TCP_Info.new(socket)
> +if IGNORED_CHECK_CLIENT_SOCKET_STATES.cover?(tcp_info.state)
> +  socket.close
> +  raise Errno::EPIPE
> +end
> +  elsif headers?
> +self.response_start_sent = true
> +HTTP_RESPONSE_START.each { |c| socket.write(c) }
> +  end
>  end
>
>  e['rack.input'] = 0 == content_length ?
> --
> 2.11.0



[PATCH] freebsd: avoid EINVAL when setting accept filter

2017-03-21 Thread Eric Wong
Accept filters can only be set on listen sockets, and it also
fails with EINVAL if it's already set.

Untested, but I suppose changing the accept filter on a listening
socket is not supported, either, since that could affect in-flight
sockets.
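
A hypothetical config line that exercises this code path on FreeBSD
('httpready' refers to the accf_http(9) filter and is assumed to be
loaded; the address and backlog are just examples):

  listen "0.0.0.0:8080", :accept_filter => 'httpready', :backlog => 1024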
---
 lib/unicorn/socket_helper.rb | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/unicorn/socket_helper.rb b/lib/unicorn/socket_helper.rb
index df8315e..7aa2bb0 100644
--- a/lib/unicorn/socket_helper.rb
+++ b/lib/unicorn/socket_helper.rb
@@ -63,12 +63,15 @@ def set_tcp_sockopt(sock, opt)
   elsif respond_to?(:accf_arg)
 name = opt[:accept_filter]
 name = DEFAULTS[:accept_filter] if name.nil?
+sock.listen(opt[:backlog])
+got = (sock.getsockopt(:SOL_SOCKET, :SO_ACCEPTFILTER) rescue nil).to_s
+arg = accf_arg(name)
 begin
-  sock.setsockopt(:SOL_SOCKET, :SO_ACCEPTFILTER, accf_arg(name))
+  sock.setsockopt(:SOL_SOCKET, :SO_ACCEPTFILTER, arg)
 rescue => e
   logger.error("#{sock_name(sock)} " \
"failed to set accept_filter=#{name} (#{e.inspect})")
-end
+end if arg != got
   end
 end
 
-- 
EW

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Simon Eskildsen
Patch-set looks great Eric, thanks!

I'm hoping to test this in production later this week or next, but
we're a year behind on Unicorn (ugh), so will need to carefully roll
that out.

On Tue, Mar 7, 2017 at 7:26 PM, Eric Wong  wrote:
> Eric Wong  wrote:
>> Simon Eskildsen  wrote:
>> > @@ -28,6 +29,7 @@ class Unicorn::HttpParser
>> ># Drop these frozen strings when Ruby 2.2 becomes more prevalent,
>> ># 2.2+ optimizes hash assignments when used with literal string keys
>> >HTTP_RESPONSE_START = [ 'HTTP', '/1.1 ']
>> > +  EMPTY_ARRAY = [].freeze
>>
>> (minor) this was made before commit e06b699683738f22
>> ("http_request: freeze constant strings passed IO#write")
>> but I've trivially fixed it up, locally.
>
> And I actually screwed it up, pushed out ccc-tcp-v2 branch
> with that fix squashed in :x
>
> Writing tests, now...



Re: [PATCH (ccc)] http_request: support proposed Raindrops::TCP states on

2017-03-21 Thread Jeremy Evans
On 03/21 02:55, Eric Wong wrote:
> TCP states names/numbers seem stable for each OS, but differ in
> name and numeric values between them.  So I started upon this
> patch series for raindrops last week:
> 
>   https://bogomips.org/raindrops-public/20170316031652.17433-...@80x24.org/T/
> 
> And things seem to be alright, for the most part.  Anyways
> here's a proposed patch for unicorn which takes advantage of
> the above (proposed) changes for raindrops and will allow
> unicorn to support check_client_connection
> 
> Also Cc:-ing Jeremy to see if he has any input on the OpenBSD
> side of things.

OpenBSD seems to support the constants you are using in raindrops:

#define TCPS_CLOSED 0   /* closed */
#define TCPS_LISTEN 1   /* listening for connection */
#define TCPS_SYN_SENT   2   /* active, have sent syn */
#define TCPS_SYN_RECEIVED   3   /* have sent and received syn */
#define TCPS_ESTABLISHED4   /* established */
#define TCPS_CLOSE_WAIT 5   /* rcvd fin, waiting for close */
#define TCPS_FIN_WAIT_1 6   /* have closed, sent fin */
#define TCPS_CLOSING7   /* closed xchd FIN; await ACK */
#define TCPS_LAST_ACK   8   /* had fin and close; await FIN ACK */
#define TCPS_FIN_WAIT_2 9   /* have closed, fin is acked */
#define TCPS_TIME_WAIT  10  /* in 2*msl quiet wait after close */

I'm fine with dropping raindrops as a unicorn dependency and making it
optional if it isn't necessary for correct operation.

Thanks,
Jeremy
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH ccc-tcp-v3] http_request: reduce insn size for check_client_connection

2017-03-21 Thread Eric Wong
Unlike constants and instance variables, class variable access
is not optimized in the mainline Ruby VM.  Use a constant
instead, to take advantage of inline constant caching.

This further reduces runtime instruction size by allocating the
Raindrops::TCP_Info object up front, avoiding a branch.

This reduces the method size by roughly 300 bytes on 64-bit.
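
A rough, stdlib-only way to observe the difference being described
(class and method names are made up; numbers vary by Ruby version
and machine):

  require 'benchmark'

  class Lookup
    @@cvar = Object.new
    CONST = Object.new
    def via_cvar;  @@cvar; end
    def via_const; CONST;  end
  end

  o = Lookup.new
  n = 5_000_000
  Benchmark.bm(7) do |x|
    x.report('cvar')  { n.times { o.via_cvar } }
    x.report('const') { n.times { o.via_const } }
  end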
---
  Also pushed to git://bogomips.org/unicorn ccc-tcp-v3

 lib/unicorn/http_request.rb | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/unicorn/http_request.rb b/lib/unicorn/http_request.rb
index c08097c..9010007 100644
--- a/lib/unicorn/http_request.rb
+++ b/lib/unicorn/http_request.rb
@@ -106,12 +106,13 @@ def hijacked?
   end
 
   if defined?(Raindrops::TCP_Info)
+TCPI = Raindrops::TCP_Info.allocate
+
 def check_client_connection(socket) # :nodoc:
   if Unicorn::TCPClient === socket
-@@tcp_info ||= Raindrops::TCP_Info.new(socket)
-@@tcp_info.get!(socket)
+# Raindrops::TCP_Info#get!, #state (reads struct tcp_info#tcpi_state)
 raise Errno::EPIPE, "client closed connection".freeze,
-  EMPTY_ARRAY if closed_state?(@@tcp_info.state)
+  EMPTY_ARRAY if closed_state?(TCPI.get!(socket).state)
   else
 write_http_header(socket)
   end
-- 
EW
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Rack::Request#params EOFError

2017-03-21 Thread Eric Wong
John Smart  wrote:
> I found an interesting bug today.  When sending a multipart form >
> 112KB, I noticed that Rack::Request#params was empty. This only occurs
> when using Unicorn, and does not occur when using Thin.
> 
> I dug into Rack source and noticed that any EOFError raised while
> reading the body input stream will cause the post body input stream to
> be ignored:
> 
> https://github.com/rack/rack/blob/cabaa58fe6ac355623746e287475af88c9395d66/lib/rack/request.rb#L357

EOFError should not happen with normal operations supported by
env['rack.input'] (read, gets, each, rewind).

So I'm not sure why rack does not let EOFError propagate up the
stack...

The equivalent Ruby methods (IO#{read,gets,each,rewind})
do not raise EOFError; read and gets should return nil on EOF...
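
A sketch of the expected rack.input behavior at EOF (mirroring
IO#read/IO#gets), using a plain StringIO as a stand-in for the body:

  require 'stringio'

  input = StringIO.new("hello")
  input.read        # => "hello"
  input.read        # => "" (read with no length returns "" at EOF)
  input.read(4096)  # => nil (read with a length returns nil at EOF)
  input.gets        # => nil at EOF; no EOFError raised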

> When multipart > 112KB, I noticed that Unicorn tees the input stream
> to a temporary file.  I was wondering: might Unicorn::TeeInput raise
> an EOFError as part of normal operation when reaching the end of the
> input stream?  If so, this would break the Rack spec.  I only tested
> this on Darwin, still working on a self-contained repro.

No, it should not raise EOFError unless a client sent less than
the Content-Length it declared in the header, or if it sent a
short chunk with "Transfer-Encoding: chunked".

EOFError should be raised to break out of the application
processing entirely if and only if the client decides to disconnect
prematurely.  This is needed to allow unicorn to move onto other
clients.

What unicorn could (and maybe should) do is raise a different
error which is not a subclass of EOFError; to prevent the error
from being caught by Rack (or any other middlewares).
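
A minimal sketch of that idea (the class name below is hypothetical,
not an existing unicorn constant):

  # an error class Rack's `rescue EOFError` would not swallow:
  class Unicorn::ClientAborted < IOError
  end

  # rack.input (TeeInput) would raise that instead of EOFError when
  # the client disconnects before sending the body it declared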

What client are you using?

Is it sending "Transfer-Encoding: chunked" or a Content-Length?

Is nginx in front of unicorn?  If not, does it happen when nginx
is in front of unicorn?

> I ended up raising client_body_buffer_size to 1MB as a work-around.
> I'm wondering if this is a bug?

Not sure, yet.  unicorn isn't meant to handle unreliable
clients; it's designed to talk to nginx.

Thanks.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Patch: Add after_worker_exit configuration option

2017-03-21 Thread Jeremy Evans
This option can be used to implement custom behavior for handling
worker exits.  For example, let's say you have a specific request
that crashes a worker process, which you expect to be due to an
improperly programmed C extension. By modifying your worker to
save request-related data in a temporary file and using this option,
you can get a record of what request is crashing the application,
which will make debugging easier.

This is not a complete patch as it doesn't include tests, but
before writing tests I wanted to see if this is something you'd
consider including in unicorn.

Example:

after_worker_exit do |server, worker, status|
  server.logger.info "worker #{status.success? ? 'exit' : 'crash'}: #{status}"

  file = "request.#{status.pid}.txt"
  if File.exist?(file)
do_something_with(File.read(file)) unless status.success?
File.delete(file)
  end
end
---
 lib/unicorn/configurator.rb | 14 ++
 lib/unicorn/http_server.rb  |  6 +++---
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index 3329c10..81589b0 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -41,6 +41,14 @@ class Unicorn::Configurator
 :before_exec => lambda { |server|
 server.logger.info("forked child re-executing...")
   },
+:after_worker_exit => lambda { |server, worker, status|
+m = "reaped #{status.inspect} worker=#{worker.nr rescue 'unknown'}"
+if status.success?
+  server.logger.info(m)
+else
+  server.logger.error(m)
+end
+  },
 :pid => nil,
 :preload_app => false,
 :check_client_connection => false,
@@ -151,6 +159,12 @@ def after_fork(*args, &block)
 set_hook(:after_fork, block_given? ? block : args[0])
   end
 
+  # sets after_worker_exit hook to a given block.  This block will be called
+  # by the master process after a worker exits.
+  def after_worker_exit(*args, &block)
+set_hook(:after_worker_exit, block_given? ? block : args[0], 3)
+  end
+
   # sets before_fork got be a given Proc object.  This Proc
   # object will be called by the master process before forking
   # each worker.
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index 35bd100..567ee0e 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -14,7 +14,8 @@ class Unicorn::HttpServer
   attr_accessor :app, :timeout, :worker_processes,
 :before_fork, :after_fork, :before_exec,
 :listener_opts, :preload_app,
-:orig_app, :config, :ready_pipe, :user
+:orig_app, :config, :ready_pipe, :user,
+:after_worker_exit
 
   attr_reader :pid, :logger
   include Unicorn::SocketHelper
@@ -395,8 +396,7 @@ def reap_all_workers
 proc_name 'master'
   else
 worker = @workers.delete(wpid) and worker.close rescue nil
-m = "reaped #{status.inspect} worker=#{worker.nr rescue 'unknown'}"
-status.success? ? logger.info(m) : logger.error(m)
+after_worker_exit.call(self, worker, status)
   end
 rescue Errno::ECHILD
   break
-- 
2.11.0

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH] doc: add version annotations for new features

2017-03-21 Thread Eric Wong
I suppose this is a good idea, too.

Will merge before the 5.3.0 RCs and release (soonish, I think...)

---8<
Subject: [PATCH] doc: add version annotations for new features

We will inevitably have people running old unicorn versions
for many years to come; but they may be reading the latest
documentation online.

Annotate when the new features (will) appear to avoid misleading
users on old versions.
---
 lib/unicorn/configurator.rb | 2 ++
 lib/unicorn/worker.rb   | 5 -
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index 7ed5ffa..3eb8c22 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -186,6 +186,8 @@ def after_worker_exit(*args, &block)
   #
   # Do not use Configurator#user if you rely on changing users in the
   # after_worker_ready hook.
+  #
+  # after_worker_ready is only available in unicorn 5.3.0+
   def after_worker_ready(*args, &block)
 set_hook(:after_worker_ready, block_given? ? block : args[0])
   end
diff --git a/lib/unicorn/worker.rb b/lib/unicorn/worker.rb
index e22c1bf..2f5b6a6 100644
--- a/lib/unicorn/worker.rb
+++ b/lib/unicorn/worker.rb
@@ -124,7 +124,10 @@ def close # :nodoc:
   # Any and all errors raised within this method will be propagated
   # directly back to the caller (usually the +after_fork+ hook.
   # These errors commonly include ArgumentError for specifying an
-  # invalid user/group and Errno::EPERM for insufficient privileges
+  # invalid user/group and Errno::EPERM for insufficient privileges.
+  #
+  # chroot support is only available in unicorn 5.3.0+
+  # user and group switching appeared in unicorn 0.94.0 (2009-11-05)
   def user(user, group = nil, chroot = false)
 # we do not protect the caller, checking Process.euid == 0 is
 # insufficient because modern systems have fine-grained
-- 
EW
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH (ccc)] http_request: support proposed Raindrops::TCP states on non-Linux

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> OpenBSD seems to support the constants you are using in raindrops:

Thanks; I expected the commonality with FreeBSD; and I expect
NetBSD and DragonflyBSD to be identical, too.

> I'm fine with dropping raindrops as a unicorn dependency and making it
> optional if it isn't necessary for correct operation.

Alright, it might be in a further-off release, though.
(unless you or somebody wants to accelerate its removal)
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Eric Wong
Simon Eskildsen  wrote:
> Looks like Puma encountered some issues with some Linux distro's
> kernels not supporting this. That's crazy..
> 
> https://github.com/puma/puma/issues/1241

No, it's not a Linux kernel problem.

That looks like a Unix socket they're using, but they only use
TCPServer.for_fd in their import_from_env function, and
inherit_unix_listener leaves TCPServer objects as-is.

TCPServer.for_fd won't barf if given a Unix socket.
In other words, this is fine:

 u = UNIXServer.new '/tmp/foo'
 t = TCPServer.for_fd(u.fileno)
 t.accept

Feel free to forward this to them, I do not use non-Free
messaging systems (and I hate centralized ones).
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH] Add worker_exec configuration option

2017-03-21 Thread Jeremy Evans
The worker_exec configuration option makes all worker processes
exec after forking.  This initializes the worker processes with
separate memory layouts, defeating address space discovery
attacks on operating systems supporting address space layout
randomization, such as Linux, MacOS X, NetBSD, OpenBSD, and
Solaris.

Support for execing workers is very similar to support for reexecing
the master process.  The main difference is the worker's to_i and
master pipes also need to be inherited after worker exec just as the
listening sockets need to be inherited after reexec.
---
 lib/unicorn/configurator.rb | 10 
 lib/unicorn/http_server.rb  | 56 ++---
 lib/unicorn/worker.rb   |  5 ++--
 3 files changed, 66 insertions(+), 5 deletions(-)

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index 7ed5ffa..f69f220 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -53,6 +53,7 @@ class Unicorn::Configurator
 server.logger.info("worker=#{worker.nr} ready")
   },
 :pid => nil,
+:worker_exec => false,
 :preload_app => false,
 :check_client_connection => false,
 :rewindable_input => true, # for Rack 2.x: (Rack::VERSION[0] <= 1),
@@ -239,6 +240,15 @@ def timeout(seconds)
 set[:timeout] = seconds > max ? max : seconds
   end
 
+  # Whether to exec in each worker process after forking.  This changes the
+  # memory layout of each worker process, which is a security feature designed
+  # to defeat possible address space discovery attacks.  Note that using
+  # worker_exec only makes sense if you are not preloading the application,
+  # and will result in higher memory usage.
+  def worker_exec(bool)
+set_bool(:worker_exec, bool)
+  end
+
   # sets the current number of worker_processes to +nr+.  Each worker
   # process will serve exactly one client at a time.  You can
   # increment or decrement this value at runtime by sending SIGTTIN
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index ef897ad..d68d7d8 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -15,7 +15,7 @@ class Unicorn::HttpServer
 :before_fork, :after_fork, :before_exec,
 :listener_opts, :preload_app,
 :orig_app, :config, :ready_pipe, :user
-  attr_writer   :after_worker_exit, :after_worker_ready
+  attr_writer   :after_worker_exit, :after_worker_ready, :worker_exec
 
   attr_reader :pid, :logger
   include Unicorn::SocketHelper
@@ -105,6 +105,14 @@ def initialize(app, options = {})
 # list of signals we care about and trap in master.
 @queue_sigs = [
   :WINCH, :QUIT, :INT, :TERM, :USR1, :USR2, :HUP, :TTIN, :TTOU ]
+
+if worker_nr = ENV['UNICORN_WORKER_NR']
+  @worker_nr = worker_nr.to_i
+  @master_pid = Process.ppid
+else
+  @master_pid = $$
+end
+
   end
 
   # Runs the thing.  Returns self so you can run join on it
@@ -113,7 +121,6 @@ def start
 # this pipe is used to wake us up from select(2) in #join when signals
 # are trapped.  See trap_deferred.
 @self_pipe.replace(Unicorn.pipe)
-@master_pid = $$
 
 # setup signal handlers before writing pid file in case people get
 # trigger happy and send signals as soon as the pid file exists.
@@ -459,6 +466,39 @@ def reexec
 proc_name 'master (old)'
   end
 
+  def worker_exec(worker)
+listener_fds = {}
+LISTENERS.each do |sock|
+  sock.close_on_exec = false
+  listener_fds[sock.fileno] = sock
+end
+ENV['UNICORN_FD'] = listener_fds.keys.join(',')
+ENV['UNICORN_WORKER_NR'] = worker.nr.to_s
+ENV['UNICORN_WORKER_TO_IO'] = worker.to_io.fileno.to_s
+ENV['UNICORN_WORKER_MASTER'] = worker.master.fileno.to_s
+Dir.chdir(START_CTX[:cwd])
+cmd = [ START_CTX[0] ].concat(START_CTX[:argv])
+
+listener_fds[worker.to_io.fileno] = worker.to_io
+listener_fds[worker.master.fileno] = worker.master
+
+# avoid leaking FDs we don't know about, but let before_exec
+# unset FD_CLOEXEC, if anything else in the app eventually
+# relies on FD inheritence.
+(3..1024).each do |io|
+  next if listener_fds.include?(io)
+  io = IO.for_fd(io) rescue next
+  io.autoclose = false
+  io.close_on_exec = true
+end
+
+# exec(command, hash) works in at least 1.9.1+, but will only be
+# required in 1.9.4/2.0.0 at earliest.
+cmd << listener_fds
+logger.info "executing worker #{cmd.inspect} (in #{Dir.pwd})"
+exec(*cmd)
+  end
+
   # forcibly terminate all workers that haven't checked in in timeout seconds.  The timeout is implemented using an unlinked File
   def murder_lazy_workers
 next_sleep = @timeout - 1
@@ -495,6 +535,12 @@ def after_fork_internal
   end
 
   def spawn_missing_workers
+if @worker_nr
+  worker = Unicorn::Worker.new(@worker_nr, [ENV["UNICORN_WORKER_TO_IO"], ENV["UNICORN_WORKER_MASTER"]].map{|i| Kgio::Pipe.for_fd(i.to_i)})
+ 

[PATCH 1/3] new test for check_client_connection

2017-03-21 Thread Eric Wong
This was a bit tricky to test, but it's probably more reliable
now that we're relying on TCP_INFO.

Based on test by Simon Eskildsen :
  
https://bogomips.org/unicorn-public/CAO3HKM49+aLD=klij3zhjqkwnr7bcwvan0movxd85xfrw8r...@mail.gmail.com/
---
 test/unit/test_ccc.rb | 81 +++
 1 file changed, 81 insertions(+)
 create mode 100644 test/unit/test_ccc.rb

diff --git a/test/unit/test_ccc.rb b/test/unit/test_ccc.rb
new file mode 100644
index 000..22b1a9c
--- /dev/null
+++ b/test/unit/test_ccc.rb
@@ -0,0 +1,81 @@
+require 'socket'
+require 'unicorn'
+require 'io/wait'
+require 'tempfile'
+require 'test/unit'
+
+class TestCccTCPI < Test::Unit::TestCase
+  def test_ccc_tcpi
+start_pid = $$
+host = '127.0.0.1'
+srv = TCPServer.new(host, 0)
+port = srv.addr[1]
+err = Tempfile.new('unicorn_ccc')
+rd, wr = IO.pipe
+pid = fork do
+  reqs = 0
+  rd.close
+  worker_pid = nil
+  app = lambda do |env|
+worker_pid ||= begin
+  at_exit { wr.write(reqs.to_s) if worker_pid == $$ }
+  $$
+end
+reqs += 1
+sleep(1) if env['PATH_INFO'] == '/sleep'
+[ 200, [ %w(Content-Length 0),  %w(Content-Type text/plain) ], [] ]
+  end
+  ENV['UNICORN_FD'] = srv.fileno.to_s
+  opts = {
+listeners: [ "#{host}:#{port}" ],
+stderr_path: err.path,
+check_client_connection: true,
+  }
+  uni = Unicorn::HttpServer.new(app, opts)
+  uni.start.join
+end
+wr.close
+
+# make sure the server is running, at least
+client = TCPSocket.new(host, port)
+client.write("GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
+assert client.wait_readable(10), 'never got response from server'
+res = client.read
+assert_match %r{\AHTTP/1\.1 200}, res, 'got part of first response'
+assert_match %r{\r\n\r\n\z}, res, 'got end of response, server is ready'
+client.close
+
+# start a slow request...
+sleeper = TCPSocket.new(host, port)
+sleeper.write("GET /sleep HTTP/1.1\r\nHost: example.com\r\n\r\n")
+
+# and a bunch of aborted ones
+nr = 100
+nr.times do |i|
+  client = TCPSocket.new(host, port)
+  client.write("GET /collections/#{rand(1)} HTTP/1.1\r\n" \
+   "Host: example.com\r\n\r\n")
+  client.close
+end
+sleeper.close
+kpid = pid
+pid = nil
+Process.kill(:QUIT, kpid)
+_, status = Process.waitpid2(kpid)
+assert status.success?
+reqs = rd.read.to_i
+warn "server got #{reqs} requests with #{nr} CCC aborted\n" if $DEBUG
+assert_operator reqs, :<, nr
+assert_operator reqs, :>=, 2, 'first 2 requests got through, at least'
+  ensure
+return if start_pid != $$
+srv.close if srv
+if pid
+  Process.kill(:QUIT, pid)
+  _, status = Process.waitpid2(pid)
+  assert status.success?
+end
+err.close! if err
+rd.close if rd
+  end
+end
-- 
EW

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] Add worker_exec configuration option

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> The worker_exec configuration option makes all worker processes
> exec after forking.  This initializes the worker processes with
> separate memory layouts, defeating address space discovery
> attacks on operating systems supporting address space layout
> randomization, such as Linux, MacOS X, NetBSD, OpenBSD, and
> Solaris.
> 
> Support for execing workers is very similar to support for reexecing
> the master process.  The main difference is the worker's to_i and
> master pipes also need to be inherited after worker exec just as the
> listening sockets need to be inherited after reexec.

Thanks, this seems like an acceptable feature.
Some comments below

> +++ b/lib/unicorn/http_server.rb
> @@ -105,6 +105,14 @@ def initialize(app, options = {})
>  # list of signals we care about and trap in master.
>  @queue_sigs = [
>:WINCH, :QUIT, :INT, :TERM, :USR1, :USR2, :HUP, :TTIN, :TTOU ]
> +
> +if worker_nr = ENV['UNICORN_WORKER_NR']
> +  @worker_nr = worker_nr.to_i
> +  @master_pid = Process.ppid
> +else
> +  @master_pid = $$
> +end
> +
>end

We should nil @worker_nr in the normal case to avoid introducing
new warnings on uninitialized ivars.

># Runs the thing.  Returns self so you can run join on it
> @@ -113,7 +121,6 @@ def start
>  # this pipe is used to wake us up from select(2) in #join when signals
>  # are trapped.  See trap_deferred.
>  @self_pipe.replace(Unicorn.pipe)
> -@master_pid = $$

Any particular reason setting @master_pid was moved to initialize?

Maybe we (or anybody else) has tests or other code which do:

  foo = Unicorn::HttpServer.new ...
  pid = fork do
foo.start.join
  end

Which would be broken by the movement from start => initialize

Again, I don't consider this a stable API and don't want people
depending on our internals; but I will also avoid breaking
existing behavior whenever possible.

> @@ -459,6 +466,39 @@ def reexec
>  proc_name 'master (old)'
>end
>  
> +  def worker_exec(worker)
> +listener_fds = {}
> +LISTENERS.each do |sock|
> +  sock.close_on_exec = false
> +  listener_fds[sock.fileno] = sock
> +end
> +ENV['UNICORN_FD'] = listener_fds.keys.join(',')
> +ENV['UNICORN_WORKER_NR'] = worker.nr.to_s

> +ENV['UNICORN_WORKER_TO_IO'] = worker.to_io.fileno.to_s
> +ENV['UNICORN_WORKER_MASTER'] = worker.master.fileno.to_s

We can cram these two FDs into one UNICORN_WORKER_FD
comma-delimited variable like UNICORN_FD.  The underlying
getenv/setenv/putenv functions are all O(n), so keeping env
smaller would be nice.

And we can even avoid modifying ENV on Ruby 1.9+ by
using exec(env, cmd, options...) or even Process.spawn (see below)
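
Combining both suggestions, roughly (a sketch only; UNICORN_WORKER_FD
is a proposed name, and cmd/listener_fds/worker follow the patch above):

  env = {
    'UNICORN_FD' => listener_fds.keys.join(','),
    'UNICORN_WORKER_NR' => worker.nr.to_s,
    'UNICORN_WORKER_FD' =>
      [worker.to_io.fileno, worker.master.fileno].join(','),
  }
  redirects = listener_fds.dup
  redirects[worker.to_io.fileno] = worker.to_io
  redirects[worker.master.fileno] = worker.master
  # exec(env, ...) merges env into the child environment without
  # mutating ENV in the master; the trailing hash keeps those FDs open
  exec(env, *cmd, redirects)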

> +Dir.chdir(START_CTX[:cwd])
> +cmd = [ START_CTX[0] ].concat(START_CTX[:argv])
> +
> +listener_fds[worker.to_io.fileno] = worker.to_io
> +listener_fds[worker.master.fileno] = worker.master
> +
> +# avoid leaking FDs we don't know about, but let before_exec
> +# unset FD_CLOEXEC, if anything else in the app eventually
> +# relies on FD inheritence.

Comment seems blindly copy+pasted, since before_exec does not
seem to exist for this...

> +(3..1024).each do |io|
> +  next if listener_fds.include?(io)
> +  io = IO.for_fd(io) rescue next
> +  io.autoclose = false
> +  io.close_on_exec = true
> +end

I suppose we should probably use Process.getrlimit, here,
maybe skip the loop entirely for Ruby 2.0.0+ (which made
close-on-exec the default).
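
Roughly (a sketch, untested):

  unless RUBY_VERSION.to_f >= 2.0 # close-on-exec is the default on 2.0.0+
    max_fd = Process.getrlimit(:NOFILE)[0] # soft limit
    (3...max_fd).each do |fd|
      next if listener_fds.include?(fd)
      io = IO.for_fd(fd) rescue next
      io.autoclose = false
      io.close_on_exec = true
    end
  end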

This will be a separate patch...


Since there's no before_exec hook as with SIGUSR2, I think we
can do all of this setup in master and use Process.spawn instead
of fork+exec explicitly.  Process.spawn transparently allows the use
of vfork+exec to avoid fork overhead on some systems.
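
That is, something like (sketch; env, cmd and redirects as in the
snippets above):

  # spawn the worker directly from the master: no explicit fork in
  # Ruby, and MRI may use vfork(2)+exec under the hood.  close_others
  # scrubs any FD we did not explicitly redirect.
  pid = Process.spawn(env, *cmd, redirects.merge(close_others: true))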

> +# exec(command, hash) works in at least 1.9.1+, but will only be
> +# required in 1.9.4/2.0.0 at earliest.

I guess it's time to update that comment, since 1.9.4 never
happened :)

> +cmd << listener_fds
> +logger.info "executing worker #{cmd.inspect} (in #{Dir.pwd})"
> +exec(*cmd)
> +  end
> +
>   # forcibly terminate all workers that haven't checked in in timeout seconds.  The timeout is implemented using an unlinked File
>def murder_lazy_workers
>  next_sleep = @timeout - 1
> @@ -495,6 +535,12 @@ def after_fork_internal
>end
>  
>def spawn_missing_workers
> +if @worker_nr
> +  worker = Unicorn::Worker.new(@worker_nr, [ENV["UNICORN_WORKER_TO_IO"], ENV["UNICORN_WORKER_MASTER"]].map{|i| Kgio::Pipe.for_fd(i.to_i)})

Please keep new code wrapped to 80 chars or less.  I'm still
using hardware that predates this project and my eyes will
only get worse with age.

> @@ -505,7 +551,11 @@ def spawn_missing_workers
>  worker.atfork_parent
>else
>  after_fork_internal
> -worker_loop(worker)
> +if @worker_exec
> +  worker_exec(wor

Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Eric Wong
Simon Eskildsen  wrote:


> I would assume you would see TIME_WAIT and CLOSE. LAST_ACK_CLOSING it
> seems pretty unlikely to hit, but not impossible. As with CLOSING,
> I've included LAST_ACK_CLOSING for completeness.

Did you mean "LAST_ACK, and CLOSING"? (not joined by underscore)

Anyways, thanks for testing and adding

>  wrote:
> > Yep, we need to account for the UNIX socket case.  I forget if
> > kgio even makes them different...
> 
> I read the implementation and verified by dumping the class when
> testing on some test boxes. You are right—it's a simple Kgio::Socket
> object, not differentiating between Kgio::TCPSocket and
> Kgio::UnixSocket at the class level. Kgio only does this if they're
> explicitly passed to override the class returned from #try_accept.
> Unicorn doesn't do this.
> 
> I've tried to find a way to determine the socket domain (INET vs.
> UNIX) on the socket object, but neither Ruby's Socket class nor Kgio
> seems to expose this. I'm not entirely sure what the simplest way to
> do this check would be. We could have the accept loop pass the correct
> class to #try_accept based on the listening socket that came back from
> #accept. If we passed the listening socket to #read after accept, we'd
> know.. but I don't like that the request knows about the listener
> either. Alternatively, we could expose the socket domain in Kgio, but
> that'll be problematic in the near-ish future as you've mentioned
> wanting to move away from Kgio as Ruby's IO library is at parity as
> per Ruby 2.4.
> 
> What do you suggest pursuing here to check whether the client socket
> is a TCP socket?

I think passing the listening socket is the best way to go about
detecting whether a socket is INET or UNIX, for now.
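
i.e. something along these lines (a sketch; your follow-up patch
below takes essentially this shape):

  def check_client_connection(socket, listener) # :nodoc:
    if Kgio::TCPServer === listener
      # TCP client: consult tcp_info via getsockopt(2)
      raise Errno::EPIPE, "client closed connection" if
        closed_state?(Raindrops::TCP_Info.new(socket).state)
    else
      # Unix socket: keep the early-write check
      write_http_header(socket)
    end
  end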

You're right about kgio, I'd rather not make more changes to
kgio but we will still need it for Ruby <2.2.x users.

And #read is an overloaded name, feel free to change it :)

> Below is a patch addressing the other concerns. I had to include
> require raindrops so the `defined?` check would do the right thing, as
> the only other file that requires Raindrops is the worker one which is
> loaded after http_request. I can change the load-order or require
> raindrops in lib/unicorn.rb if you prefer.

The require is fine.  However we do not need a class variable,
below...

>  # TODO: remove redundant names
>  Unicorn.const_set(:HttpRequest, Unicorn::HttpParser)
> @@ -29,6 +30,7 @@ class Unicorn::HttpParser
># 2.2+ optimizes hash assignments when used with literal string keys
>HTTP_RESPONSE_START = [ 'HTTP', '/1.1 ']
>@@input_class = Unicorn::TeeInput
> +  @@raindrops_tcp_info_defined = defined?(Raindrops::TCP_Info)

I prefer we avoid adding this cvar, instead...

>@@check_client_connection = false
> 
>def self.input_class
> @@ -83,11 +85,7 @@ def read(socket)
>false until add_parse(socket.kgio_read!(16384))
>  end
> 
> -# detect if the socket is valid by writing a partial response:
> -if @@check_client_connection && headers?
> -  self.response_start_sent = true
> -  HTTP_RESPONSE_START.each { |c| socket.write(c) }
> -end
> +check_client_connection(socket) if @@check_client_connection
> 
>  e['rack.input'] = 0 == content_length ?
>NULL_IO : @@input_class.new(socket, self)
> @@ -108,4 +106,27 @@ def call
>def hijacked?
>  env.include?('rack.hijack_io'.freeze)
>end

... we can have different methods defined:

   if defined?(Raindrops::TCP_Info) # Linux, maybe FreeBSD
 def check_client_connection(client, listener) # :nodoc:
 ...
 end
   else # portable version
 def check_client_connection(client, listener) # :nodoc:
 ...
 end
   end

And eliminate the class variable entirely.

> +
> +  private

I prefer to avoid marking methods as 'private' to ease any
ad-hoc unit testing which may come up.  Instead, rely on :nodoc:
directives to discourage people from depending on it.

Thanks.

> +  def check_client_connection(socket)
> +if @@raindrops_tcp_info_defined
> +  tcp_info = Raindrops::TCP_Info.new(socket)
> +  raise Errno::EPIPE, "client closed connection".freeze, [] if
> closed_state?(tcp_info.state)
> +elsif headers?
> +  self.response_start_sent = true
> +  HTTP_RESPONSE_START.each { |c| socket.write(c) }
> +end
> +  end
> +
> +  def closed_state?(state)
> +case state
> +when 1 # ESTABLISHED
> +  false
> +when 6, 7, 8, 9, 11 # TIME_WAIT, CLOSE, CLOSE_WAIT, LAST_ACK, CLOSING
> +  true
> +else
> +  false
> +end
> +  end
>  end

closed_state? looks good to me, good call on short-circuiting
the common case of ESTABLISHED!
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add after_worker_ready configuration option V2

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> Here's V2 of the after_worker_ready patch.  This adds some more
> documentation, and tries to give a better description of the
> advantages in the commit message.

Thanks, I've pushed this and the chroot patch out to the
'chroot' branch.  Will wait a bit for comments from others
before merging into 'master'.

The following changes since commit c8f06be298d667ba85573668ee916680a258c2c7:

  Fix code example in after_worker_exit documentation (2017-02-23 19:26:30 +0000)

are available in the git repository at:

  git://bogomips.org/unicorn chroot

for you to fetch changes up to d322345251e15125df896bb8f0a5b53b49a1bf3f:

  Add after_worker_ready configuration option (2017-02-23 20:23:44 +0000)


Jeremy Evans (2):
  Add support for chroot to Worker#user
  Add after_worker_ready configuration option

 lib/unicorn/configurator.rb | 22 ++
 lib/unicorn/http_server.rb  |  4 ++--
 lib/unicorn/worker.rb   | 13 ++---
 3 files changed, 34 insertions(+), 5 deletions(-)
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Fix code example in after_worker_exit documentation

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> Here's a separate commit for the documentation fix.

Thanks, pushed as c8f06be298d667ba85573668ee916680a258c2c7
to 'master'.  I added a "Fixes: " tag to link it the original.

> From 6efb4b8e07a0459180579b72fac5003e424837c0 Mon Sep 17 00:00:00 2001

Hint: you can omit the above line, and add "8<" so the
applier can use "git am --scissors" without having to just part
of the message.  It's no big deal, but I guess I'll do what I
can to promote --scissors as it's not too well-known.
(and usually, the entire email is the patch for "git am")
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add after_worker_ready configuration option

2017-03-21 Thread Eric Wong
Eric Wong  wrote:
> Also, were you planning to send a v2 for chroot support?

+ Or did you want me to squash the explicit path change in?

> I'm fine with either.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add support for chroot to Worker#user

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> Any chrooting would need to happen inside Worker#user, because
> you can't chroot until after you have parsed the list of groups,
> and you must chroot before dropping root privileges.
> 
> chroot is an important security feature, so that if the unicorn
> process is exploited, file system access is limited to the chroot
> directory instead of the entire system.

I'm hesitant to document this as "an important security feature".

Perhaps "an extra layer of security" is more accurate,
as there are ways of breaking out of a chroot.

> This is not a complete patch as it does not include tests. I can
> add tests if you would consider accepting this feature.

I wouldn't worry about automated tests if elevated privileges
are required.

> -  # Changes the worker process to the specified +user+ and +group+
> +  # Changes the worker process to the specified +user+ and +group+,
> +  # and chroots to the current working directory if +chroot+ is set.
># This is only intended to be called from within the worker
># process from the +after_fork+ hook.  This should be called in
># the +after_fork+ hook after any privileged functions need to be
> @@ -123,7 +124,7 @@ def close # :nodoc:
># directly back to the caller (usually the +after_fork+ hook.
># These errors commonly include ArgumentError for specifying an
># invalid user/group and Errno::EPERM for insufficient privileges
> -  def user(user, group = nil)
> +  def user(user, group = nil, chroot = false)
>  # we do not protect the caller, checking Process.euid == 0 is
>  # insufficient because modern systems have fine-grained
>  # capabilities.  Let the caller handle any and all errors.
> @@ -134,6 +135,10 @@ def user(user, group = nil)
>Process.initgroups(user, gid)
>Process::GID.change_privilege(gid)
>  end
> +if chroot
> +  Dir.chroot(Dir.pwd)
> +  Dir.chdir('/')
> +end

Perhaps this can be made more flexible by allowing chroot to
any specified directory, not just pwd.  Something like:

chroot = Dir.pwd if chroot == true
if chroot
  Dir.chroot(chroot)
  Dir.chdir('/')
end

Anyways I'm inclined to accept this feature.
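
Usage from a hook would then look something like this (a sketch,
assuming the more flexible chroot argument suggested above):

  after_fork do |server, worker|
    # drop privileges and confine the worker to the app directory
    worker.user('app_user', 'app_group', '/srv/app')
  end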
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add after_worker_ready configuration option

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:

 acknowledging everything above.

> On 02/23 02:32, Eric Wong wrote:
> > Jeremy Evans  wrote:
> > Anyways, I'm somewhat inclined to accept this as well, but will
> > think about it a bit, too.
> > 
> > One problem I have with this change is that it requires users
> > read more documentation and know more caveats to use.
>  
> It's unfortunate, but I think it's necessary in this case. Unicorn's
> default behavior works for most users, but this makes it so you
> don't have to override HttpServer#init_worker_process to get similar
> functionality.

Yeah.  Unfortunately, with every option comes more
support/maintenance overhead.  So I'm trying to make sure we've
explored every possible avenue before adding this new option.



> > (*) Fwiw, I wasn't thrilled to add user switching to unicorn back
> > in 2009/2010, as it should not need to run privileged ports.
> 
> I think had you not added it, there wouldn't be a need for a chroot
> patch, since otherwise one would assume any user that wanted to
> drop privileges would just run the necessary code themselves.
> 
> One thing to keep in mind is that if you did not add the user switching
> code, developers who need it would implement it themselves, and would
> likely miss things like initgroups.

Heh.  As I was mentioning above, every feature can come with
new, unintended consequences :>  But yeah, maybe push things
up to Ruby stdlib, or systemd...

> > So I'm wondering if there can be some of that which could be
> > trimmed out someday or made optional, somehow.
> 
> I'm not sure how open you are to a plugin-like system for Unicorn, where
> you can have separate files for separate features, and users can choose
> which features the want to use.  This limits memory usage so that you
> don't pay a memory cost for features you don't use.  This is the
> approach used by Sequel, Roda, and Rodauth (other libraries I maintain),
> and if you are interested in a similar system for Unicorn, I would be
> happy to work on one.

Not really.  As I've stated many times in the past, unicorn is
just one of many servers available for Rack users; so I've been
against people building too many dependencies on unicorn, that
comes at the expense of other servers.

What I had in mind was having a shim process which does socket
setup, user switching, etc. before exec-ing unicorn in cases
where a systemd-like spawner is not used.  But it's probably too
small to make an impact on real apps anyways, since Ruby still
has tons of methods which never get called, either.

> I do have a couple of additional patches I plan to send, but I think it
> may be better to ask here first.
> 
> One is that SIGUSR1 handling doesn't work well with chroot if the log
> file is outside of the chroot, since the worker process may not have
> access to reopen the file.  One simple fix is to reopen the logs in
> the master process, then send SIGQUIT instead of SIGUSR1 to the child
> processes.  Are you open to a patch that does that?

Right now, the worker exits with some log spew if reopening
logs fails, and the worker respawns anyways, correct?

Perhaps that mapping can be made automatic if chroot is
enabled.  I wonder if that's too much magic...
Or the log spew could be suppressed when the error
is ENOENT and chroot is enabled.

The sysadmin could also send SIGHUP to reload everything
and respawn all children.

So, I'm wondering if using SIGHUP is too much of a burden, or if
automatic USR1 => QUIT mapping for chroot users is too magical.

(trying to avoid adding more options + documentation, as always :)

> The second feature is more exotic security feature called fork+exec.
> The current issue with unicorn's direct forking of children is that
> forked children share the memory layout of the server.  This is good
> from a memory usage standpoint due to CoW memory.  However, if there
> is a memory vulnerability in the unicorn application, this makes it
> easier to exploit, since if the a unicorn worker process crashes, the
> master process will fork a child with pretty much the same memory
> layout.  This allows address space discovery attacks.  Using fork+exec,
> each child process would have a unique address space for ASLR, on
> most Unix-like systems (Linux, MacOS X, NetBSD, OpenBSD, Solaris).
> There are additional security advantages from fork+exec on OpenBSD, but
> all Unix-like systems that use ASLR can benefit from the unique
> memory layout per process.

It depends how intrusive the implementation is...

Is this merely calling exec in the worker?

I'm not sure if your use of "fork+exec" is merely the two
well-known syscalls combined, or if there's something else;
I'll assume the former, since that's enough to have a
unique address space for Ruby bytecode.

Can this be done in an after_fork/after_worker_ready hook?

Socket inheritance can be done the same way it handles USR2
upgrades and systemd socket inheritance.
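
i.e. the same UNICORN_FD convention unicorn already uses when
inheriting listeners across USR2 upgrades (sketch):

  # before exec-ing the worker:
  LISTENERS.each { |sock| sock.close_on_exec = false }
  ENV['UNICORN_FD'] = LISTENERS.map(&:fileno).join(',')
  # after exec, the new process re-wraps those FDs with
  # TCPServer.for_fd / UNIXServer.for_fd instead of re-binding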
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/

Re: check_client_connection using getsockopt(2)

2017-03-21 Thread Eric Wong
Simon Eskildsen  wrote:

 great to know it's still working after all these years :>

> This confirms Eric's comment that the existing
> `check_client_connection` works perfectly on loopback, but as soon as
> you put an actual network between the Unicorn and client—it's only
> effective 20% of the time, even with `TCP_NODELAY`. I'm assuming, due
> to buffering, even when disabling Nagle's. As we're changing our
> architecture, we move from ngx (lb) -> ngx (host) -> unicorn to ngx
> (lb) -> unicorn. That means this patch will no longer work for us.

Side comment: I'm a bit curious how you guys arrived at removing
nginx at the host level (if you're allowed to share that info)

I've mainly kept nginx on the same host as unicorn, but used
haproxy or similar (with minimal buffering) at the LB level.
That allows better filesystem load distribution for large uploads.

> I propose instead of the early `write(2)` hack, that we use
> `getsockopt(2)` with the `TCP_INFO` flag and read the state of the
> socket. If it's in `CLOSE_WAIT` or `CLOSE`, we kill the connection and
> move on to the next.
> 
> https://github.com/torvalds/linux/blob/8fa3b6f9392bf6d90cb7b908e07bd90166639f0a/include/uapi/linux/tcp.h#L163
> 
> This is not going to be portable, but we can do this on only Linux
> which I suspect is where most production deployments of Unicorn that
> would benefit from this feature run. It's not useful in development
> (which is likely more common to not be Linux). We could also introduce
> it under a new configuration option if desired. In my testing, this
> works to reject 100% of requests early when not on loopback.

Good to know about it's effectiveness!  I don't mind adding
Linux-only features as long as existing functionality still
works on the server-oriented *BSDs.

> The code is essentially:
> 
> def client_closed?
>   tcp_info = socket.getsockopt(Socket::SOL_TCP, Socket::TCP_INFO)
>   state = tcp_info.unpack("c")[0]
>   state == TCP_CLOSE || state == TCP_CLOSE_WAIT
> end

+Cc: raindrops-pub...@bogomips.org -> https://bogomips.org/raindrops-public/

Fwiw, raindrops (already a dependency of unicorn) has TCP_INFO
support under Linux:

https://bogomips.org/raindrops.git/tree/ext/raindrops/linux_tcp_info.c

I haven't used it, much, but FreeBSD also implements a subset of
this feature, nowadays, and ought to be supportable, too.  We
can look into adding it for raindrops.

I don't know about other BSDs...
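
A sketch of the raindrops API in question (Linux-only at the time of
writing; client_socket here is just an illustrative name):

  require 'raindrops'
  tcp_info = Raindrops::TCP_Info.new(client_socket) # getsockopt(TCP_INFO)
  tcp_info.state                # numeric tcpi_state, 1 == ESTABLISHED on Linux
  tcp_info.get!(client_socket)  # refresh in place, avoiding a new object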

> This could be called at the top of `#process_client` in `http_server.rb`.
> 
> Would there be interest in this upstream? Any comments on this
> proposed implementation? Currently, we're using a middleware with the
> Rack hijack API, but this seems like it belongs at the webserver
> level.

I guess hijack means you have to discard other middlewares and
the normal rack response handling in unicorn?  If so, ouch!

Unfortunately, without hijack, there's no portable way in Rack
to get at the underlying IO object; so I guess this needs to
be done at the server level...

So yeah, I guess it belongs in the webserver.

I think we can automatically use TCP_INFO if available for
check_client_connection; then fall back to the old early write
hack for Unix sockets and systems w/o TCP_INFO (which would set
the response_start_sent flag).

No need to introduce yet another configuration option.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add after_worker_ready configuration option V2

2017-03-21 Thread Eric Wong
Eric Wong  wrote:
> Thanks, I've pushed this and the chroot patch out to the
> 'chroot' branch.  Will wait a bit for comments from others
> before merging into 'master'.

No comments, so no objections, so merged and pushed to 'master'
as commit ff13ad38ba9f83e0dd298be451aac7c75145d33b

Merge remote-tracking branch 'origin/chroot'

* origin/chroot:
  Add after_worker_ready configuration option
  Add support for chroot to Worker#user

Thanks.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] Add worker_exec configuration option

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> Thanks for your detailed review.  I will work on an updated patch and
> try to send it tomorrow or Friday.

No problem and no rush; I might be out a bit the next few days.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH] http_server: initialize @pid ivar

2017-03-21 Thread Eric Wong
This quiets down warnings when run with '-w'
---
 lib/unicorn/http_server.rb | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index ef897ad..4a2ccce 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -89,6 +89,7 @@ def initialize(app, options = {})
 @self_pipe = []
 @workers = {} # hash maps PIDs to Workers
 @sig_queue = [] # signal queue used for self-piping
+@pid = nil
 
 # we try inheriting listeners first, so we bind them later.
 # we don't write the pid file until we've bound listeners in case
-- 
EW

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add after_worker_ready configuration option

2017-03-21 Thread Jeremy Evans
On 02/23 02:32, Eric Wong wrote:
> Jeremy Evans  wrote:
> > This adds a hook that is called after the application has
> > been loaded by the worker process, directly before it starts
> > accepting requests.  This hook is necessary if your application
> > needs to gain access to resources during initialization,
> > and then drop privileges before serving requests.
> > 
> > This is especially useful in conjunction with chroot support
> > so the app can load all the normal ruby libraries it needs
> > to function, and then chroot before accepting requests.
> > 
> > If you are preloading the app, it's possible to drop privileges
> > or chroot in after_fork, but if you are not preloading the app,
> > there is not currently a proper place to handle this, hence
> > the reason for this feature.
> 
> This means using Unicorn::Worker#user directly,
> and not Unicorn::Configurator#user.  OK...

Correct.  We could add configuration support to specify when to call
Worker#user, but that seems like overkill.
 
> Also, were you planning to send a v2 for chroot support?
> I'm fine with either.

Sorry about that, I will send a v2 for chroot support tomorrow.

> 
> > --- a/lib/unicorn/configurator.rb
> > +++ b/lib/unicorn/configurator.rb
> 
> > @@ -162,7 +165,7 @@ def after_fork(*args, &block)
> ># sets after_worker_exit hook to a given block.  This block will be 
> > called
> ># by the master process after a worker exits:
> >#
> > -  #  after_fork do |server,worker,status|
> > +  #  after_worker_exit do |server,worker,status|
> >## status is a Process::Status instance for the exited worker process
> >#unless status.success?
> >#  server.logger.error("worker process failure: #{status.inspect}")
> 
> This looks like an unrelated hunk which belongs in a separate
> patch.  I can squeeze in a separate fix for this (credited to
> you) or you can send a separate patch.
 
Correct, this was an unrelated bug in my last patch.  I'll separate this
hunk into a separate patch.

> > @@ -172,6 +175,18 @@ def after_worker_exit(*args, &block)
> >  set_hook(:after_worker_exit, block_given? ? block : args[0], 3)
> >end
> >  
> > +  # sets after_worker_ready hook to a given block.  This block will be 
> > called
> > +  # by a worker process after it has been fully loaded, directly before it
> > +  # starts responding to requests:
> > +  #
> > +  #  after_worker_ready do |server,worker|
> > +  #server.logger.info("worker #{worker.nr} ready, dropping privileges")
> > +  #worker.user('username', 'groupname')
> > +  #  end
> 
> The documentation should probably say something along the lines
> of:
> 
> Do not use Unicorn::Configurator#user if you rely on
> changing users in the after_worker_ready hook.
> 
> And maybe corresponding documentation for Unicorn::Configurator#user,
> too.

OK, I'll update the documentation to be more complete.

> Anyways, I'm somewhat inclined to accept this as well, but will
> think about it a bit, too.
> 
> One problem I have with this change is that it requires users
> read more documentation and know more caveats to use.
 
It's unfortunate, but I think it's necessary in this case. Unicorn's
default behavior works for most users, but this makes it so you
don't have to override HttpServer#init_worker_process to get similar
functionality.

> Adding to that, Unicorn::Configurator#user has been favored
> (documentation-wise) over Unicorn::Worker#user in the past;
> so it's a bit of a reversal(*)

I think Configurator#user still makes sense for most users.  As the
documentation mentions, directly calling Worker#user should only
be done if Configurator#user is not sufficient.

> Based on the series of changes you're making, I'm sensing you're
> working on some complex system which might need more security
> than a typical app.

These patches are related to all applications using Unicorn that I run
on OpenBSD (about 20 separate applications).  They are designed to
to reduce the attack surface if an attacker is able to find a
remote code execution vulnerability in one of the applications.

In addition to chroot(2), after chrooting, these applications also
use pledge(2), which is an OpenBSD-specific system call filter
(similar in principle to seccomp(2) in Linux).

> I know you use OpenBSD, and I don't know much about the init
> system or how deployment works, there.
 
It's a very simple sh(1) based-init system, similar to traditional Unix.
Most OpenBSD system daemons will start as root, chroot to a specific
directory, drop privileges (possibly separate privileges in separate
daemon processes connected via IPC), and finally use pledge(2) to limit
what each process can actually use in terms of system calls.

> Nowadays, it seems more common things (binding sockets,
> switching users, chrooting, etc..) is handled by init systems
> (at least systemd); and having that functionality in unicorn
> would mean redundant bloat.

I believe only Linux uses a systemd-like appr

[PATCH (ccc)] http_request: support proposed Raindrops::TCP states on non-Linux

2017-03-21 Thread Eric Wong
TCP states names/numbers seem stable for each OS, but differ in
name and numeric values between them.  So I started upon this
patch series for raindrops last week:

  https://bogomips.org/raindrops-public/20170316031652.17433-...@80x24.org/T/

And things seem to be alright, for the most part.  Anyways
here's a proposed patch for unicorn which takes advantage of
the above (proposed) changes for raindrops and will allow
unicorn to support check_client_connection

Also Cc:-ing Jeremy to see if he has any input on the OpenBSD
side of things.

This goes on top of commit 20c66dbf1ebd0ca993e7a79c9d0d833d747df358
("http_request: reduce insn size for check_client_connection")
at: git://bogomips.org/unicorn ccc-tcp-v3

---8<
Subject: [PATCH] http_request: support proposed Raindrops::TCP states on
 non-Linux

raindrops 0.18+ will have Raindrops::TCP state hash for portable
mapping of TCP states to their respective numeric values.  This
was necessary because TCP state numbers (and even macro names)
differ between FreeBSD and Linux (and possibly other OSes).

Favor using the Raindrops::TCP state hash if available, but
fall back to the hard-coded values since older versions of
raindrops did not support TCP_INFO on non-Linux systems.

While we're in the area, favor "const_defined?" over "defined?"
to reduce the inline constant cache footprint for branches
which are only evaluated once.

Patches to implement Raindrops::TCP for FreeBSD are available at:

  https://bogomips.org/raindrops-public/20170316031652.17433-...@80x24.org/T/
---
 lib/unicorn/http_request.rb | 38 +-
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/lib/unicorn/http_request.rb b/lib/unicorn/http_request.rb
index 9010007..7253497 100644
--- a/lib/unicorn/http_request.rb
+++ b/lib/unicorn/http_request.rb
@@ -105,7 +105,7 @@ def hijacked?
 env.include?('rack.hijack_io'.freeze)
   end
 
-  if defined?(Raindrops::TCP_Info)
+  if Raindrops.const_defined?(:TCP_Info)
 TCPI = Raindrops::TCP_Info.allocate
 
 def check_client_connection(socket) # :nodoc:
@@ -118,14 +118,34 @@ def check_client_connection(socket) # :nodoc:
   end
 end
 
-def closed_state?(state) # :nodoc:
-  case state
-  when 1 # ESTABLISHED
-false
-  when 8, 6, 7, 9, 11 # CLOSE_WAIT, TIME_WAIT, CLOSE, LAST_ACK, CLOSING
-true
-  else
-false
+if Raindrops.const_defined?(:TCP)
+  # raindrops 0.18.0+ supports FreeBSD + Linux using the same names
+  # Evaluate these hash lookups at load time so we can
+  # generate an opt_case_dispatch instruction
+  eval <<-EOS
+  def closed_state?(state) # :nodoc:
+case state
+when #{Raindrops::TCP[:ESTABLISHED]}
+  false
+when #{Raindrops::TCP.values_at(
+  :CLOSE_WAIT, :TIME_WAIT, :CLOSE, :LAST_ACK, :CLOSING).join(',')}
+  true
+else
+  false
+end
+  end
+  EOS
+else
+  # raindrops before 0.18 only supported TCP_INFO under Linux
+  def closed_state?(state) # :nodoc:
+case state
+when 1 # ESTABLISHED
+  false
+when 8, 6, 7, 9, 11 # CLOSE_WAIT, TIME_WAIT, CLOSE, LAST_ACK, CLOSING
+  true
+else
+  false
+end
   end
 end
   else
-- 
EW
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] Add worker_exec configuration option V2

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> Here's V2 of the patch, which I think should address all of the issues
> you pointed out.

Thanks.  Pushed to git://80x24.org/unicorn worker_exec.  I can add
tests later, or you can.  I also had some FreeBSD test fixes
(which might apply to OpenBSD) on a VM somewhere which I'll Cc:
you on; there was also a SO_KEEPALIVE fix I posted:

  https://bogomips.org/unicorn-public/20170310203431.28067-...@80x24.org/raw

>  worker_nr = -1
>  until (worker_nr += 1) == @worker_processes
>@workers.value?(worker_nr) and next
>worker = Unicorn::Worker.new(worker_nr)
>before_fork.call(self, worker)
> -  if pid = fork
> -@workers[pid] = worker
> -worker.atfork_parent
> +
> +  pid = if @worker_exec
> +worker_spawn(worker)
>else
> -after_fork_internal
> -worker_loop(worker)
> -exit
> +fork do
> +  after_fork_internal
> +  worker_loop(worker)
> +  exit
> +end

I prefer to avoid the block with fork.  The block deepens the
stack for the running app, so it can affect GC efficiency.

Can be fixed in a separate patch...
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Simon Eskildsen
Here's another update Eric!

* Use a frozen empty array and a class variable for TCP_Info to avoid
garbage. As far as I can tell, this shouldn't result in any garbage on
any requests (other than on the first request).
* Pass listener socket to #read to only check the client connection on
a TCP server.
* Short-circuit CLOSE_WAIT right after ESTABLISHED since in my testing
it's the most common state after ESTABLISHED; this makes the numbers
un-ordered, but the comment should make that OK.
* Definition of `check_client_connection` based on whether
Raindrops::TCP_Info is defined, instead of the class variable
approach.
* Changed the unit tests to pass a `nil` listener.

Tested on our staging environment, and still works like a dream.

I should note that I got the idea behind this patch into Puma as well!

https://github.com/puma/puma/pull/1227


---
 lib/unicorn/http_request.rb | 44 ++--
 lib/unicorn/http_server.rb  |  6 +++---
 test/unit/test_request.rb   | 28 ++--
 3 files changed, 55 insertions(+), 23 deletions(-)

diff --git a/lib/unicorn/http_request.rb b/lib/unicorn/http_request.rb
index 0c1f9bb..21a99ef 100644
--- a/lib/unicorn/http_request.rb
+++ b/lib/unicorn/http_request.rb
@@ -2,6 +2,7 @@
 # :enddoc:
 # no stable API here
 require 'unicorn_http'
+require 'raindrops'

 # TODO: remove redundant names
 Unicorn.const_set(:HttpRequest, Unicorn::HttpParser)
@@ -28,6 +29,7 @@ class Unicorn::HttpParser
   # Drop these frozen strings when Ruby 2.2 becomes more prevalent,
   # 2.2+ optimizes hash assignments when used with literal string keys
   HTTP_RESPONSE_START = [ 'HTTP', '/1.1 ']
+  EMPTY_ARRAY = [].freeze
   @@input_class = Unicorn::TeeInput
   @@check_client_connection = false

@@ -62,7 +64,7 @@ def self.check_client_connection=(bool)
   # returns an environment hash suitable for Rack if successful
   # This does minimal exception trapping and it is up to the caller
   # to handle any socket errors (e.g. user aborted upload).
-  def read(socket)
+  def read(socket, listener)
 clear
 e = env

@@ -83,11 +85,7 @@ def read(socket)
   false until add_parse(socket.kgio_read!(16384))
 end

-# detect if the socket is valid by writing a partial response:
-if @@check_client_connection && headers?
-  self.response_start_sent = true
-  HTTP_RESPONSE_START.each { |c| socket.write(c) }
-end
+check_client_connection(socket, listener) if @@check_client_connection

 e['rack.input'] = 0 == content_length ?
   NULL_IO : @@input_class.new(socket, self)
@@ -108,4 +106,38 @@ def call
   def hijacked?
 env.include?('rack.hijack_io'.freeze)
   end
+
+  if defined?(Raindrops::TCP_Info)
+def check_client_connection(socket, listener) # :nodoc:
+  if Kgio::TCPServer === listener
+@@tcp_info ||= Raindrops::TCP_Info.new(socket)
+@@tcp_info.get!(socket)
+raise Errno::EPIPE, "client closed connection".freeze,
EMPTY_ARRAY if closed_state?(@@tcp_info.state)
+  else
+write_http_header(socket)
+  end
+end
+
+def closed_state?(state) # :nodoc:
+  case state
+  when 1 # ESTABLISHED
+false
+  when 8, 6, 7, 9, 11 # CLOSE_WAIT, TIME_WAIT, CLOSE, LAST_ACK, CLOSING
+true
+  else
+false
+  end
+end
+  else
+def check_client_connection(socket, listener) # :nodoc:
+  write_http_header(socket)
+end
+  end
+
+  def write_http_header(socket) # :nodoc:
+if headers?
+  self.response_start_sent = true
+  HTTP_RESPONSE_START.each { |c| socket.write(c) }
+end
+  end
 end
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index 35bd100..4190641 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -558,8 +558,8 @@ def e100_response_write(client, env)

   # once a client is accepted, it is processed in its entirety here
   # in 3 easy steps: read request, call app, write app response
-  def process_client(client)
-status, headers, body = @app.call(env = @request.read(client))
+  def process_client(client, listener)
+status, headers, body = @app.call(env = @request.read(client, listener))

 begin
   return if @request.hijacked?
@@ -655,7 +655,7 @@ def worker_loop(worker)
 # Unicorn::Worker#kgio_tryaccept is not like accept(2) at all,
 # but that will return false
 if client = sock.kgio_tryaccept
-  process_client(client)
+  process_client(client, sock)
   nr += 1
   worker.tick = time_now.to_i
 end
diff --git a/test/unit/test_request.rb b/test/unit/test_request.rb
index f0ccaf7..dbe8af7 100644
--- a/test/unit/test_request.rb
+++ b/test/unit/test_request.rb
@@ -30,7 +30,7 @@ def setup
   def test_options
 client = MockRequest.new("OPTIONS * HTTP/1.1\r\n" \
  "Host: foo\r\n\r\n")
-env = @request.read(client)
+env = @request.read(client, nil)
   

Patch: Add support for chroot to Worker#user

2017-03-21 Thread Jeremy Evans
Any chrooting would need to happen inside Worker#user, because
you can't chroot until after you have parsed the list of groups,
and you must chroot before dropping root privileges.

chroot is an important security feature, so that if the unicorn
process is exploited, file system access is limited to the chroot
directory instead of the entire system.

This is not a complete patch as it does not include tests. I can
add tests if you would consider accepting this feature.
---
 lib/unicorn/worker.rb | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/lib/unicorn/worker.rb b/lib/unicorn/worker.rb
index 6748a2f..db418ea 100644
--- a/lib/unicorn/worker.rb
+++ b/lib/unicorn/worker.rb
@@ -111,9 +111,10 @@ def close # :nodoc:
   # In most cases, you should be using the Unicorn::Configurator#user
   # directive instead.  This method should only be used if you need
   # fine-grained control of exactly when you want to change permissions
-  # in your after_fork hooks.
+  # in your after_fork hooks, or if you want to use the chroot support.
   #
-  # Changes the worker process to the specified +user+ and +group+
+  # Changes the worker process to the specified +user+ and +group+,
+  # and chroots to the current working directory if +chroot+ is set.
   # This is only intended to be called from within the worker
   # process from the +after_fork+ hook.  This should be called in
   # the +after_fork+ hook after any privileged functions need to be
@@ -123,7 +124,7 @@ def close # :nodoc:
   # directly back to the caller (usually the +after_fork+ hook.
   # These errors commonly include ArgumentError for specifying an
   # invalid user/group and Errno::EPERM for insufficient privileges
-  def user(user, group = nil)
+  def user(user, group = nil, chroot = false)
 # we do not protect the caller, checking Process.euid == 0 is
 # insufficient because modern systems have fine-grained
 # capabilities.  Let the caller handle any and all errors.
@@ -134,6 +135,10 @@ def user(user, group = nil)
   Process.initgroups(user, gid)
   Process::GID.change_privilege(gid)
 end
+if chroot
+  Dir.chroot(Dir.pwd)
+  Dir.chdir('/')
+end
 Process.euid != uid and Process::UID.change_privilege(uid)
 @switched = true
   end
-- 
2.11.0

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Patch: Add after_worker_ready configuration option

2017-03-21 Thread Jeremy Evans
This adds a hook that is called after the application has
been loaded by the worker process, directly before it starts
accepting requests.  This hook is necessary if your application
needs to gain access to resources during initialization,
and then drop privileges before serving requests.

This is especially useful in conjunction with chroot support
so the app can load all the normal ruby libraries it needs
to function, and then chroot before accepting requests.

If you are preloading the app, it's possible to drop privileges
or chroot in after_fork, but if you are not preloading the app,
there is not currently a proper place to handle this, hence
the reason for this feature.
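
A rough sketch of the intended usage in a unicorn config file
(names are only examples; the chroot argument comes from the
companion Worker#user patch):

  after_worker_ready do |server, worker|
    server.logger.info("worker=#{worker.nr} ready, chrooting and dropping privileges")
    worker.user('app_user', 'app_group', true)
  end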
---
 lib/unicorn/configurator.rb | 17 -
 lib/unicorn/http_server.rb  |  4 ++--
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index 5bad925..6df5a30 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -49,6 +49,9 @@ class Unicorn::Configurator
   server.logger.error(m)
 end
   },
+:after_worker_ready => lambda { |server, worker|
+server.logger.info("worker=#{worker.nr} ready")
+  },
 :pid => nil,
 :preload_app => false,
 :check_client_connection => false,
@@ -162,7 +165,7 @@ def after_fork(*args, &block)
   # sets after_worker_exit hook to a given block.  This block will be called
   # by the master process after a worker exits:
   #
-  #  after_fork do |server,worker,status|
+  #  after_worker_exit do |server,worker,status|
   ## status is a Process::Status instance for the exited worker process
   #unless status.success?
   #  server.logger.error("worker process failure: #{status.inspect}")
@@ -172,6 +175,18 @@ def after_worker_exit(*args, &block)
 set_hook(:after_worker_exit, block_given? ? block : args[0], 3)
   end
 
+  # sets after_worker_ready hook to a given block.  This block will be called
+  # by a worker process after it has been fully loaded, directly before it
+  # starts responding to requests:
+  #
+  #  after_worker_ready do |server,worker|
+  #server.logger.info("worker #{worker.nr} ready, dropping privileges")
+  #worker.user('username', 'groupname')
+  #  end
+  def after_worker_ready(*args, &block)
+set_hook(:after_worker_ready, block_given? ? block : args[0])
+  end
+
   # sets before_fork got be a given Proc object.  This Proc
   # object will be called by the master process before forking
   # each worker.
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index c2086cb..ef897ad 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -15,7 +15,7 @@ class Unicorn::HttpServer
 :before_fork, :after_fork, :before_exec,
 :listener_opts, :preload_app,
 :orig_app, :config, :ready_pipe, :user
-  attr_writer   :after_worker_exit
+  attr_writer   :after_worker_exit, :after_worker_ready
 
   attr_reader :pid, :logger
   include Unicorn::SocketHelper
@@ -644,7 +644,7 @@ def worker_loop(worker)
 trap(:USR1) { nr = -65536 }
 
 ready = readers.dup
-@logger.info "worker=#{worker.nr} ready"
+@after_worker_ready.call(self, worker)
 
 begin
   nr < 0 and reopen_worker_logs(worker.nr)
-- 
2.11.0

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add after_worker_ready configuration option

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> This adds a hook that is called after the application has
> been loaded by the worker process, directly before it starts
> accepting requests.  This hook is necessary if your application
> needs to gain access to resources during initialization,
> and then drop privileges before serving requests.
> 
> This is especially useful in conjunction with chroot support
> so the app can load all the normal ruby libraries it needs
> to function, and then chroot before accepting requests.
> 
> If you are preloading the app, it's possible to drop privileges
> or chroot in after_fork, but if you are not preloading the app,
> there is not currently a proper place to handle this, hence
> the reason for this feature.

This means using Unicorn::Worker#user directly,
and not Unicorn::Configurator#user.  OK...

Also, were you planning to send a v2 for chroot support?
I'm fine with either.

> --- a/lib/unicorn/configurator.rb
> +++ b/lib/unicorn/configurator.rb

> @@ -162,7 +165,7 @@ def after_fork(*args, &block)
># sets after_worker_exit hook to a given block.  This block will be called
># by the master process after a worker exits:
>#
> -  #  after_fork do |server,worker,status|
> +  #  after_worker_exit do |server,worker,status|
>## status is a Process::Status instance for the exited worker process
>#unless status.success?
>#  server.logger.error("worker process failure: #{status.inspect}")

This looks like an unrelated hunk which belongs in a separate
patch.  I can squeeze in a separate fix for this (credited to
you) or you can send a separate patch.

> @@ -172,6 +175,18 @@ def after_worker_exit(*args, &block)
>  set_hook(:after_worker_exit, block_given? ? block : args[0], 3)
>end
>  
> +  # sets after_worker_ready hook to a given block.  This block will be called
> +  # by a worker process after it has been fully loaded, directly before it
> +  # starts responding to requests:
> +  #
> +  #  after_worker_ready do |server,worker|
> +  #server.logger.info("worker #{worker.nr} ready, dropping privileges")
> +  #worker.user('username', 'groupname')
> +  #  end

The documentation should probably say something along the lines
of:

Do not use Unicorn::Configurator#user if you rely on
changing users in the after_worker_ready hook.

And maybe corresponding documentation for Unicorn::Configurator#user,
too.

Anyways, I'm somewhat inclined to accept this as well, but will
think about it a bit, too.

One problem I have with this change is that it requires users
read more documentation and know more caveats to use.

Adding to that, Unicorn::Configurator#user has been favored
(documentation-wise) over Unicorn::Worker#user in the past;
so it's a bit of a reversal(*)

Based on the series of changes you're making, I'm sensing you're
working on some complex system which might need more security
than a typical app.

I know you use OpenBSD, and I don't know much about the init
system or how deployment works, there.

Nowadays, it seems more common for things (binding sockets,
switching users, chrooting, etc.) to be handled by init systems
(at least systemd); and having that functionality in unicorn
would mean redundant bloat.

It would be useful to me and folks who won't use these features
to understand how and why they're used.  That can help justify
the (admittedly small) memory overhead which comes with it.

Thanks.


(*) Fwiw, I wasn't thrilled to add user switching to unicorn back
in 2009/2010, as it should not need to run privileged ports.
So I'm wondering if there can be some of that which could be
trimmed out someday or made optional, somehow.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: check_client_connection using getsockopt(2)

2017-03-21 Thread Simon Eskildsen
On Wed, Feb 22, 2017 at 8:42 PM, Eric Wong  wrote:
> Simon Eskildsen  wrote:
>
>  Thanks for the writeup!
>
> Another sidenote: It seems nginx <-> unicorn is a bit odd
> for deployment in a containerized environment(*).
>
>> I meant to ask, in Raindrops why do you use the netlink API to get the
>> socket backlog instead of `getsockopt(2)` with `TCP_INFO` to get
>> `tcpi_unacked`? (as described in
>> http://www.ryanfrantz.com/posts/apache-tcp-backlog/) We use this to
>> monitor socket backlogs with a sidekick Ruby daemon. Although we're
>> looking to replace it with a middleware to simplify for Kubernetes.
>> It's one of our main metrics for monitoring performance, especially
>> around deploys.
>
> The netlink API allows independently-spawned processes to
> monitor others; so it can be system-wide.  TCP_INFO requires the
> process doing the checking to also have the socket open.
>
> I guess this reasoning for using netlink is invalid for containers,
> though...

If you namespace the network it's problematic, yeah. I'm considering
right now putting Raindrops in a middleware with the netlink API
inside the container, but it feels weird. That said, if you consider
the alternative of using `getsockopt(2)` on the listening socket, I
don't know how you'd get access to the Unicorn listening socket from a
middleware. Would it be nuts to expose a hook in Unicorn that allows
periodic execution for monitoring listening stats from Raindrops on
the listening socket? It seems somewhat of a narrow use-case, but on
the other hand I'm also not a fan of doing
`Raindrops::Linux.tcp_listener_stats("localhost:#{ENV["PORT"]}")`, but
that might be less ugly.
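
For reference, the middleware variant I'm describing would look
roughly like this (untested sketch; the PORT environment variable
and the report method are placeholders):

  require 'raindrops'

  class ListenerStatsMiddleware
    def initialize(app, addr = "0.0.0.0:#{ENV['PORT']}")
      @app = app
      @addr = addr
    end

    def call(env)
      # tcp_listener_stats uses the netlink API and returns a hash of
      # "addr:port" => ListenStats(active, queued)
      stats = Raindrops::Linux.tcp_listener_stats([@addr])[@addr]
      report(stats.active, stats.queued) if stats
      @app.call(env)
    end

    def report(active, queued)
      # placeholder: ship the numbers to whatever metrics system is in use
    end
  end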

>
>> I was going to use `env["unicorn.socket"]`/`env["puma.socket"]`, but
>> you could also do `env.delete("hijack_io")` after hijacking to allow
>> Unicorn to still render the response. Unfortunately the
>> `.socket` key is not part of the Rack standard, so I'm
>> hesitant to use it. When this gets into Unicorn I'm planning to
>> propose it upstream to Puma as well.
>
> I was going to say env.delete("rack.hijack_io") is dangerous
> (since env could be copied by middleware); but apparently not:
> rack.hijack won't work with a copied env, either.
> You only need to delete with the same env object you call
> rack.hijack with.
>
> But calling rack.hijack followed by env.delete may still
> have unintended side-effects in other servers; so I guess
> we (again) cannot rely on hijack working portably.

Exactly, it gives the illusion of portability but e.g. Puma stores an
instance variable to check whether a middleware hijacked, rendering
the `env#delete` option useless.

>
>> Cool. How would you suggest I check for TCP_INFO compatible platforms
>> in Unicorn? Is `RUBY_PLATFORM.ends_with?("linux".freeze)` sufficient
>> or do you prefer another mechanism? I agree that we should fall back
>> to the write hack on other platforms.
>
> The Raindrops::TCP_Info class should be undefined on unsupported
> platforms, so I think you can check for that, instead.  Then it
> should be transparent to add FreeBSD support from unicorn's
> perspective.

Perfect. I'll start working on a patch.
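
(Roughly the shape I have in mind, defining the method once at load
time so the per-request path never pays for the defined? check; the
bodies are elided here:)

  if defined?(Raindrops::TCP_Info)
    def check_client_connection(socket)
      # Linux: inspect the socket state via getsockopt(2)/TCP_INFO
    end
  else
    def check_client_connection(socket)
      # other platforms: keep the early partial-write check
    end
  end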

>
>
> (*) So I've been wondering if adding a "unicorn-mode" to an
> existing C10K, slow-client-capable Ruby/Rack server +
> reverse proxy makes sense for containerized deploys.
> Maybe...

I'd love to hear more about this idea. What are you contemplating?



[PATCH] oob_gc: rely on opt_aref_with optimization on Ruby 2.2+

2017-03-21 Thread Eric Wong
oob_gc probably isn't heavily used anymore, but maybe some
Ruby 2.2+ users will benefit from this constant reduction.

Followup-to: fb2f10e1d7a72e67 ("reduce constants and optimize for Ruby 2.2")
---
 lib/unicorn/oob_gc.rb | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/unicorn/oob_gc.rb b/lib/unicorn/oob_gc.rb
index 5572e59..c4741a0 100644
--- a/lib/unicorn/oob_gc.rb
+++ b/lib/unicorn/oob_gc.rb
@@ -66,10 +66,9 @@ def self.new(app, interval = 5, path = %r{\A/})
   end
 
   #:stopdoc:
-  PATH_INFO = "PATH_INFO"
   def process_client(client)
 super(client) # Unicorn::HttpServer#process_client
-if OOBGC_PATH =~ OOBGC_ENV[PATH_INFO] && ((@@nr -= 1) <= 0)
+if OOBGC_PATH =~ OOBGC_ENV['PATH_INFO'] && ((@@nr -= 1) <= 0)
   @@nr = OOBGC_INTERVAL
   OOBGC_ENV.clear
   disabled = GC.enable
-- 
EW

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add after_worker_exit configuration option

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> Here's a revised patch that should address the issues you identified:

Thanks.  Pushed as 2af91a1fef70d6546ee03760011c170a082db667

I needed one change to an integration test to workaround
the lack of reader attribute, but I'd rather pay the cost
in the test than the server, itself:

https://bogomips.org/unicorn.git/patch?id=2c6aa5878d052abb
("t/t0012-reload-empty-config.sh: access ivars directly if needed")
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] Add worker_exec configuration option V2

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> On 03/10 09:19, Eric Wong wrote:
> > tests, later, or you can.  I also had some FreeBSD test fixes
> > (which might apply to OpenBSD) on a VM somewhere which I'll Cc:
> > you on: there was also just SO_KEEPALIVE fix I posted:
> > 
> >   https://bogomips.org/unicorn-public/20170310203431.28067-...@80x24.org/raw
> 
> The C test code also returns 8 on OpenBSD, FWIW.  I'm happy to test any
> test fixes on OpenBSD, just let me know.

Thanks. I'll be happy to help fix any OpenBSD failures you see
from "gmake check".  I don't think the accept_filter fixes apply
to OpenBSD, but I guess the expr(1) fix did.

Thank you.

Same goes for NetBSD, DragonflyBSD or any other completely
Free/Open Source OSes anybody else here uses.

> > Jeremy Evans  wrote:
> > > -  if pid = fork
> > > -@workers[pid] = worker
> > > -worker.atfork_parent
> > > +
> > > +  pid = if @worker_exec
> > > +worker_spawn(worker)
> > >else
> > > -after_fork_internal
> > > -worker_loop(worker)
> > > -exit
> > > +fork do
> > > +  after_fork_internal
> > > +  worker_loop(worker)
> > > +  exit
> > > +end
> > 
> > I prefer to avoid the block with fork.  The block deepens the
> > stack for the running app, so it can affect GC efficiency.
> > 
> > Can be fixed in a separate patch...
> 
> That makes sense.  If you would like me to send a separate patch to fix
> it, I can do that.

Yes, please.

Not sure if we should automate the stack depth test...
I resisted it in the past since it might be too fragile
w.r.t changes to Ruby (even using "unicorn" via the
RubyGems-generated wrapper deepens it by 2).

Thanks again.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] Add worker_exec configuration option V2

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> Here's a patch to fix the stack depth issue:

Thanks, applied with some trailing whitespace fixups.

(I suggest enabling the pre-commit hook distributed with
 git in $GIT_DIR/hooks/pre-commit, it'll run "git diff-index --check ..."
 to check that)

> From aa4914846a0870e5b01530a47d6dbfe09aeb39ce Mon Sep 17 00:00:00 2001

Btw, you can omit the above line and replace it with "---8<---"
(no quotes) so the applier can use "git am --scissors".

Thanks.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Eric Wong
Awesome!  Thanks for the update.  I'll merge ccc-tcp-v3 into
master soon and push out 5.3.0-rc1

(Maybe there's no need to quote at all; every message is archived
 forever in several public places:
  https://bogomips.org/unicorn-public/
  https://www.mail-archive.com/unicorn-public@bogomips.org
  maybe some others...)
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Simon Eskildsen
> I prefer we use a hash or case statement.  Both allow more
> optimization in the YARV VM of CRuby (opt_aref and
> opt_case_dispatch in insns.def).  case _might_ be a little
> faster if there's no constant lookup overhead, but
> a microbench or dumping the bytecode will be necessary
> to be sure :)
>
> A hash or a case can also help portability-wise in case
> we hit a system where these numbers are non-sequential;
> or if we forgot something.

Good point. I double checked all the states on Linux and found that we
were missing TCP_CLOSING [1] [2]. This is a state where the other side
is closed, and you have buffered data on your side. It doesn't seem
like this would ever happen in Unicorn, but I think we should include
it for completeness. This also means the range becomes non-sequential.
I looked at FreeBSD [3] and illumos (Solaris-derived) [4], and for
the TCP states we're interested in they also appear to use
non-contiguous numbering.

My co-worker, Kir Shatrov, benchmarked a bunch of approaches to the
state check and found that case is a good solution [5].  Due to the
realness of non-sequential states in common operating systems, I think
case is the way to go here as you suggested. I've made sure to
short-circuit the common-case of TCP_ESTABLISHED. I've only seen
CLOSE_WAIT in testing, but in the wild at large production scale
I would assume you would also see TIME_WAIT and CLOSE. LAST_ACK
seems pretty unlikely to hit, but not impossible. As with CLOSING,
I've included LAST_ACK for completeness.

[1] https://github.com/torvalds/linux/blob/5924bbecd0267d87c24110cbe2041b5075173a25/include/net/tcp_states.h#L27
[2] https://github.com/torvalds/linux/blob/ca78d3173cff3503bcd15723b049757f75762d15/net/ipv4/tcp.c#L228
[3] https://github.com/freebsd/freebsd/blob/386ddae58459341ec567604707805814a2128a57/sys/netinet/tcp_fsm.h
[4] https://github.com/illumos/illumos-gate/blob/f7877f5d39900cfd8b20dd673e5ccc1ef7cc7447/usr/src/uts/common/netinet/tcp_fsm.h
[5] https://gist.github.com/kirs/11ba4ce84c08188c9f7eba9c639616a5
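
To make the check above concrete, the shape I'm planning is roughly
the following (Linux state numbers; the method name is a placeholder):

  def closed_state?(state)
    case state
    when 1 # TCP_ESTABLISHED: short-circuit the common case
      false
    when 8, 6, 7, 9, 11 # CLOSE_WAIT, TIME_WAIT, CLOSE, LAST_ACK, CLOSING
      true
    else
      false
    end
  end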

> Yep, we need to account for the UNIX socket case.  I forget if
> kgio even makes them different...

I read the implementation and verified by dumping the class when
testing on some test boxes. You are right—it's a simple Kgio::Socket
object, not differentiating between Kgio::TCPSocket and
Kgio::UnixSocket at the class level. Kgio only does this if they're
explicitly passed to override the class returned from #try_accept.
Unicorn doesn't do this.

I've tried to find a way to determine the socket domain (INET vs.
UNIX) on the socket object, but neither Ruby's Socket class nor Kgio
seems to expose this. I'm not entirely sure what the simplest way to
do this check would be. We could have the accept loop pass the correct
class to #try_accept based on the listening socket that came back from
#accept. If we passed the listening socket to #read after accept, we'd
know.. but I don't like that the request knows about the listener
either. Alternatively, we could expose the socket domain in Kgio, but
that'll be problematic in the near-ish future as you've mentioned
wanting to move away from Kgio as Ruby's IO library is at parity as
per Ruby 2.4.

What do you suggest pursuing here to check whether the client socket
is a TCP socket?

Below is a patch addressing the other concerns. I had to include
require raindrops so the `defined?` check would do the right thing, as
the only other file that requires Raindrops is the worker one which is
loaded after http_request. I can change the load-order or require
raindrops in lib/unicorn.rb if you prefer.

Missing is the socket type check. Thanks for your feedback!

---
 lib/unicorn/http_request.rb | 31 ++-
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/lib/unicorn/http_request.rb b/lib/unicorn/http_request.rb
index 0c1f9bb..eedccac 100644
--- a/lib/unicorn/http_request.rb
+++ b/lib/unicorn/http_request.rb
@@ -2,6 +2,7 @@
 # :enddoc:
 # no stable API here
 require 'unicorn_http'
+require 'raindrops'

 # TODO: remove redundant names
 Unicorn.const_set(:HttpRequest, Unicorn::HttpParser)
@@ -29,6 +30,7 @@ class Unicorn::HttpParser
   # 2.2+ optimizes hash assignments when used with literal string keys
   HTTP_RESPONSE_START = [ 'HTTP', '/1.1 ']
   @@input_class = Unicorn::TeeInput
+  @@raindrops_tcp_info_defined = defined?(Raindrops::TCP_Info)
   @@check_client_connection = false

   def self.input_class
@@ -83,11 +85,7 @@ def read(socket)
   false until add_parse(socket.kgio_read!(16384))
 end

-# detect if the socket is valid by writing a partial response:
-if @@check_client_connection && headers?
-  self.response_start_sent = true
-  HTTP_RESPONSE_START.each { |c| socket.write(c) }
-end
+check_client_connection(socket) if @@check_client_connection

 e['rack.input'] = 0 == content_length ?
   NULL_IO : @@input_class.new(socket, self)
@@ -108,4 +106,27 @@ def call

Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Eric Wong
Simon Eskildsen  wrote:
 
> On Sat, Feb 25, 2017 at 9:03 AM, Simon Eskildsen
>  wrote:
> > The implementation of the check_client_connection causing an early write is
> > ineffective when not performed on loopback. In my testing, when on 
> > non-loopback,
> > such as another host, we only see a 10-20% rejection rate with TCP_NODELAY 
> > of
> > clients that are closed. This means 90-80% of responses in this case are
> > rendered in vain.
> >
> > This patch uses getosockopt(2) with TCP_INFO on Linux to read the socket 
> > state.
> > If the socket is in a state where it doesn't take a response, we'll abort 
> > the
> > request with a similar error as to what write(2) would give us on a closed
> > socket in case of an error.
> >
> > In my testing, this has a 100% rejection rate. This testing was conducted 
> > with
> > the following script:

Good to know!

> > diff --git a/lib/unicorn/http_request.rb b/lib/unicorn/http_request.rb
> > index 0c1f9bb..b4c95fc 100644
> > --- a/lib/unicorn/http_request.rb
> > +++ b/lib/unicorn/http_request.rb
> > @@ -31,6 +31,9 @@ class Unicorn::HttpParser
> >@@input_class = Unicorn::TeeInput
> >@@check_client_connection = false
> >
> > +  # TCP_TIME_WAIT: 6, TCP_CLOSE: 7, TCP_CLOSE_WAIT: 8, TCP_LAST_ACK: 9
> > +  IGNORED_CHECK_CLIENT_SOCKET_STATES = (6..9)

Thanks for documenting the numbers.

I prefer we use a hash or case statement.  Both allow more
optimization in the YARV VM of CRuby (opt_aref and
opt_case_dispatch in insns.def).  case _might_ be a little
faster if there's no constant lookup overhead, but 
a microbench or dumping the bytecode will be necessary
to be sure :)

A hash or a case can also help portability-wise in case
we hit a system where these numbers are non-sequential;
or if we forgot something.

> > -# detect if the socket is valid by writing a partial response:
> > -if @@check_client_connection && headers?
> > -  self.response_start_sent = true
> > -  HTTP_RESPONSE_START.each { |c| socket.write(c) }
> > +# detect if the socket is valid by checking client connection.
> > +if @@check_client_connection

We can probably split everything inside this if to a new
method, this avoid penalizing people who don't use this feature.
Using check_client_connection will already have a high cost,
since it requires at least one extra syscall.

> > +  if defined?(Raindrops::TCP_Info)

...because defined? only needs to be checked once for the
lifetime of the process; we can define different methods
at load time to avoid the check for each and every request.

> > +tcp_info = Raindrops::TCP_Info.new(socket)
> > +if IGNORED_CHECK_CLIENT_SOCKET_STATES.cover?(tcp_info.state)
> > +  socket.close

Right, no need to socket.close, here; handle_error does it.

> > +  raise Errno::EPIPE

Since we don't print out the backtrace in handle_error, we
can raise without a backtrace to avoid excess garbage.

> > +end
> > +  elsif headers?
> > +self.response_start_sent = true
> > +HTTP_RESPONSE_START.each { |c| socket.write(c) }
> > +  end
> >  end

> I left out a TCPSocket check, we'll need a `socket.is_a?(TCPSocket)`
> in there. I'll wait with sending an updated patch in case you have
> other comments. I'm also not entirely sure we need the `socket.close`.
> What do you think?

Yep, we need to account for the UNIX socket case.  I forget if
kgio even makes them different...

Anyways I won't be online much for a few days, and it's the
weekend, so no rush :)

Thanks.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH] t0011-active-unix-socket.sh: fix race condition in test

2017-03-21 Thread Eric Wong
Killing the master process may lead to the worker dying on its
own (as designed); before kill(1) has had an opportunity to send
the second kill(2) syscall on the worker process.

Killing the worker before the master might also lead to a
needless respawn, so merely kill the master and let the worker
follow it in death.

This race condition occasionally caused test failures on slow,
uniprocessor hardware.
---
 t/t0011-active-unix-socket.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t0011-active-unix-socket.sh b/t/t0011-active-unix-socket.sh
index d256f5c..fae0b6c 100755
--- a/t/t0011-active-unix-socket.sh
+++ b/t/t0011-active-unix-socket.sh
@@ -52,7 +52,7 @@ t_begin "worker pid unchanged (again)" && {
 }
 
 t_begin "nuking the existing Unicorn succeeds" && {
-   kill -9 $unicorn_pid $worker_pid
+   kill -9 $unicorn_pid
while kill -0 $unicorn_pid
do
sleep 1
-- 
EW

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH] test_exec: SO_KEEPALIVE value only needs to be true

2017-03-21 Thread Eric Wong
On FreeBSD 10.3, the value of SO_KEEPALIVE returned by
getsockopt is 8, even when set to '1' via setsockopt.
Relax the test to only ensure the boolean value is
interpreted as "true".

Verified independently of Ruby using the following:
8<-
#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>

static int err(const char *msg)
{
perror(msg);
return 1;
}

int main(void)
{
int sv[2];
int set = 1;
int got;
socklen_t len = (socklen_t)sizeof(int);
int rc;

rc = socketpair(PF_LOCAL, SOCK_STREAM, 0, sv);
if (rc) return err("socketpair failed");

rc = setsockopt(sv[0], SOL_SOCKET, SO_KEEPALIVE, &set, len);
if (rc) return err("setsockopt failed");

rc = getsockopt(sv[0], SOL_SOCKET, SO_KEEPALIVE, &got, &len);
if (rc) return err("getsockopt failed");

printf("got: %d\n", got);
return 0;
}
---
 test/exec/test_exec.rb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/test/exec/test_exec.rb b/test/exec/test_exec.rb
index ca0b7bc..37ba90f 100644
--- a/test/exec/test_exec.rb
+++ b/test/exec/test_exec.rb
@@ -142,7 +142,7 @@ def test_inherit_listener_unspecified
    res = hit(["http://#@addr:#@port/"])
 assert_equal [ "HI\n" ], res
 assert_shutdown(pid)
-assert_equal 1, sock.getsockopt(:SOL_SOCKET, :SO_KEEPALIVE).int,
+assert sock.getsockopt(:SOL_SOCKET, :SO_KEEPALIVE).bool,
 'unicorn should always set SO_KEEPALIVE on inherited sockets'
   ensure
 sock.close if sock
-- 
EW

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add after_worker_ready configuration option

2017-03-21 Thread Jeremy Evans
On 02/23 09:23, Eric Wong wrote:
> Jeremy Evans  wrote:
> > On 02/23 02:32, Eric Wong wrote:
> > > Jeremy Evans  wrote:
> > > Anyways, I'm somewhat inclined to accept this as well, but will
> > > think about it a bit, too.
> > > 
> > > One problem I have with this change is that it requires users
> > > read more documentation and know more caveats to use.
> >  
> > It's unfortunate, but I think it's necessary in this case. Unicorn's
> > default behavior works for most users, but this makes it so you
> > don't have to override HttpServer#init_worker_process to get similar
> > functionality.
> 
> Yeah.  Unfortunately, with every option comes more
> support/maintenance overhead.  So I'm trying to make sure we've
> explored every possible avenue before adding this new option.

I agree.  If I knew of a way to get this functionality without
overriding a private method, I would use it.  I'm not sure there is a
way currently, which is why I worked on the patch.

> > > So I'm wondering if there can be some of that which could be
> > > trimmed out someday or made optional, somehow.
> > 
> > I'm not sure how open you are to a plugin-like system for Unicorn, where
> > you can have separate files for separate features, and users can choose
> > which features the want to use.  This limits memory usage so that you
> > don't pay a memory cost for features you don't use.  This is the
> > approach used by Sequel, Roda, and Rodauth (other libraries I maintain),
> > and if you are interested in a similar system for Unicorn, I would be
> > happy to work on one.
> 
> Not really.  As I've stated many times in the past, unicorn is
> just one of many servers available for Rack users; so I've been
> against people building too many dependencies on unicorn, that
> comes at the expense of other servers.
 
True, but there are features that only unicorn provides.  If you want
those features, you have to use unicorn anyway. 

> What I had in mind was having a shim process which does socket
> setup, user switching, etc. before exec-ing unicorn in cases
> where a systemd-like spawner is not used.  But it's probably too
> small to make an impact on real apps anyways, since Ruby still
> has tons of methods which never get called, either.

I could see that decreasing memory usage slightly, but I agree it's
unlikely to make a significant impact.

> > I do have a couple of additional patches I plan to send, but I think it
> > may be better to ask here first.
> > 
> > One is that SIGUSR1 handling doesn't work well with chroot if the log
> > file is outside of the chroot, since the worker process may not have
> > access to reopen the file.  One simple fix is to reopen the logs in
> > the master process, then send SIGQUIT instead of SIGUSR1 to the child
> > processes.  Are you open to a patch that does that?
> 
> Right now, the worker exits with some log spew if reopening
> logs fails, and the worker respawns anyways, correct?
> 
> Perhaps that mapping can be made automatic if chroot is
> enabled.  I wonder if that's too much magic...
> Or the log spew could be suppressed when the error
> is ENOENT and chroot is enabled.
> 
> The sysadmin could also send SIGHUP to reload everything
> and respawn all children.
> 
> So, I'm wondering if using SIGHUP is too much of a burden, or if
> automatic USR1 => QUIT mapping for chroot users is too magical.

SIGHUP is actually fine.  I didn't realize it reopened log files
in the master process, but apparently it does, so I can just use
SIGHUP instead of SIGUSR1.

> > The second feature is more exotic security feature called fork+exec.
> > The current issue with unicorn's direct forking of children is that
> > forked children share the memory layout of the server.  This is good
> > from a memory usage standpoint due to CoW memory.  However, if there
> > is a memory vulnerability in the unicorn application, this makes it
> > easier to exploit, since if a unicorn worker process crashes, the
> > master process will fork a child with pretty much the same memory
> > layout.  This allows address space discovery attacks.  Using fork+exec,
> > each child process would have a unique address space for ASLR, on
> > most Unix-like systems (Linux, MacOS X, NetBSD, OpenBSD, Solaris).
> > There are additional security advantages from fork+exec on OpenBSD, but
> > all Unix-like systems that use ASLR can benefit from the unique
> > memory layout per process.
> 
> It depends how intrusive the implementation is...

That I don't know yet.  I haven't implemented fork+exec before, so I'm
not sure how many changes are needed. I probably won't be able to work
on this right away.

> Is this merely calling exec in the worker?
> 
> I'm not sure if your use of "fork+exec" is merely the two
> well-known syscalls combined, or if there's something else;
> I'll assume the former, since that's enough to have a
> unique address space for Ruby bytecode.
> 
> Can this be done in an after_fork/after_worker_ready hook?
> 
> Socket inherit

[PATCH] doc: remove private email support address

2017-03-21 Thread Eric Wong
Email was never private, and I won't further burden myself or
any future maintainers with trying to maintain someone else's
privacy.

Offering private support is also unfair to readers on public
lists who may get a watered down or improperly translated
summary (if at all).

Instead, encourage the use of anonymity tools and scrubbing of
sensitive information when the sender deems necessary.
---
 No, I don't do GPG, as I have no real identity :P

 .olddoc.yml |  1 -
 ISSUES  | 11 ++-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/.olddoc.yml b/.olddoc.yml
index ee2d306..cacc0ab 100644
--- a/.olddoc.yml
+++ b/.olddoc.yml
@@ -12,7 +12,6 @@ noindex:
 - TODO
 - unicorn_rails_1
 public_email: unicorn-public@bogomips.org
-private_email: unic...@bogomips.org
 nntp_url:
   - nntp://news.public-inbox.org/inbox.comp.lang.ruby.unicorn
   - nntp://news.gmane.org/gmane.comp.lang.ruby.unicorn.general
diff --git a/ISSUES b/ISSUES
index 291441a..e06f9cd 100644
--- a/ISSUES
+++ b/ISSUES
@@ -9,14 +9,16 @@ submit patches and/or obtain support after you have searched 
the
 * Cc: all participants in a thread or commit, as subscription is optional
 * Do not {top post}[http://catb.org/jargon/html/T/top-post.html] in replies
 * Quote as little as possible of the message you're replying to
-* Do not send HTML mail or images, it will be flagged as spam
-* Anonymous and pseudonymous messages will always be welcome.
+* Do not send HTML mail or images,
+  they hurt reader privacy and will be flagged as spam
+* Anonymous and pseudonymous messages will ALWAYS be welcome
 * The email submission port (587) is enabled on the bogomips.org MX:
   https://bogomips.org/unicorn-public/20141004232241.ga23...@dcvr.yhbt.net/t/
 
 If your issue is of a sensitive nature or you're just shy in public,
-then feel free to email us privately at mailto:unic...@bogomips.org
-instead and your issue will be handled discreetly.
+use anonymity tools such as Tor or Mixmaster; and rely on the public
+mail archives for responses.  Be sure to scrub sensitive log messages
+and such.
 
 If you don't get a response within a few days, we may have forgotten
 about it so feel free to ask again.
@@ -64,7 +66,6 @@ document distributed with git) on guidelines for patch 
submission.
 == Contact Info
 
 * public: mailto:unicorn-public@bogomips.org
-* private: mailto:unic...@bogomips.org
 * nntp://news.gmane.org/gmane.comp.lang.ruby.unicorn.general
 * nntp://news.public-inbox.org/inbox.comp.lang.ruby.unicorn
 * https://bogomips.org/unicorn-public/
-- 
EW

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH] unicorn_http: reduce rb_global_variable calls

2017-03-21 Thread Eric Wong
rb_global_variable registers the address of the variable which
refers to the object, instead of the object itself.  This adds
extra overhead to each global variable for our case, where the
variable is frozen and never changed.

Given there are currently 59 elements in this array, this saves
58 singly-linked list entries and associated malloc calls and
associated overhead in the current mainline Ruby 2.x
implementation.  On 64-bit GNU libc malloc, this is already 16 *
58 = 928 bytes; more than the extra object slot and array slack
space used by the new mark array.

Mainline Ruby 1.9+ currently has a rb_gc_register_mark_object
public function which would suite our needs, too, but it is
currently undocumented, and may not be available in the future.
---
 ext/unicorn_http/common_field_optimization.h |  4 ++--
 ext/unicorn_http/global_variables.h  |  4 ++--
 ext/unicorn_http/httpdate.c  |  4 ++--
 ext/unicorn_http/unicorn_http.rl | 13 +
 4 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/ext/unicorn_http/common_field_optimization.h 
b/ext/unicorn_http/common_field_optimization.h
index 42c5430..251e734 100644
--- a/ext/unicorn_http/common_field_optimization.h
+++ b/ext/unicorn_http/common_field_optimization.h
@@ -60,7 +60,7 @@ static struct common_field common_http_fields[] = {
 #define HTTP_PREFIX_LEN (sizeof(HTTP_PREFIX) - 1)
 
 /* this function is not performance-critical, called only at load time */
-static void init_common_fields(void)
+static void init_common_fields(VALUE mark_ary)
 {
   int i;
   struct common_field *cf = common_http_fields;
@@ -77,7 +77,7 @@ static void init_common_fields(void)
   cf->value = rb_str_new(tmp, HTTP_PREFIX_LEN + cf->len);
 }
 cf->value = rb_obj_freeze(cf->value);
-rb_global_variable(&cf->value);
+rb_ary_push(mark_ary, cf->value);
   }
 }
 
diff --git a/ext/unicorn_http/global_variables.h 
b/ext/unicorn_http/global_variables.h
index e1c43c9..c17ee6a 100644
--- a/ext/unicorn_http/global_variables.h
+++ b/ext/unicorn_http/global_variables.h
@@ -56,7 +56,7 @@ NORETURN(static void parser_raise(VALUE klass, const char *));
 /** Defines global strings in the init method. */
 #define DEF_GLOBAL(N, val) do { \
   g_##N = rb_obj_freeze(rb_str_new(val, sizeof(val) - 1)); \
-  rb_global_variable(&g_##N); \
+  rb_ary_push(mark_ary, g_##N); \
 } while (0)
 
 /* Defines the maximum allowed lengths for various input elements.*/
@@ -67,7 +67,7 @@ DEF_MAX_LENGTH(FRAGMENT, 1024); /* Don't know if this length 
is specified somewh
 DEF_MAX_LENGTH(REQUEST_PATH, 4096); /* common PATH_MAX on modern systems */
 DEF_MAX_LENGTH(QUERY_STRING, (1024 * 10));
 
-static void init_globals(void)
+static void init_globals(VALUE mark_ary)
 {
   DEF_GLOBAL(rack_url_scheme, "rack.url_scheme");
   DEF_GLOBAL(request_method, "REQUEST_METHOD");
diff --git a/ext/unicorn_http/httpdate.c b/ext/unicorn_http/httpdate.c
index 0a1045f..2381cff 100644
--- a/ext/unicorn_http/httpdate.c
+++ b/ext/unicorn_http/httpdate.c
@@ -64,13 +64,13 @@ static VALUE httpdate(VALUE self)
return buf;
 }
 
-void init_unicorn_httpdate(void)
+void init_unicorn_httpdate(VALUE mark_ary)
 {
VALUE mod = rb_define_module("Unicorn");
mod = rb_define_module_under(mod, "HttpResponse");
 
buf = rb_str_new(0, buf_capa - 1);
-   rb_global_variable(&buf);
+   rb_ary_push(mark_ary, buf);
buf_ptr = RSTRING_PTR(buf);
httpdate(Qnil);
 
diff --git a/ext/unicorn_http/unicorn_http.rl b/ext/unicorn_http/unicorn_http.rl
index 957a5e3..6fc3498 100644
--- a/ext/unicorn_http/unicorn_http.rl
+++ b/ext/unicorn_http/unicorn_http.rl
@@ -13,7 +13,7 @@
 #include "global_variables.h"
 #include "c_util.h"
 
-void init_unicorn_httpdate(void);
+void init_unicorn_httpdate(VALUE mark_ary);
 
 #define UH_FL_CHUNKED  0x1
 #define UH_FL_HASBODY  0x2
@@ -917,8 +917,10 @@ static VALUE HttpParser_rssget(VALUE self)
 
 void Init_unicorn_http(void)
 {
+  static VALUE mark_ary;
   VALUE mUnicorn, cHttpParser;
 
+  mark_ary = rb_ary_new();
   mUnicorn = rb_define_module("Unicorn");
   cHttpParser = rb_define_class_under(mUnicorn, "HttpParser", rb_cObject);
   eHttpParserError =
@@ -928,7 +930,7 @@ void Init_unicorn_http(void)
   e414 = rb_define_class_under(mUnicorn, "RequestURITooLongError",
eHttpParserError);
 
-  init_globals();
+  init_globals(mark_ary);
   rb_define_alloc_func(cHttpParser, HttpParser_alloc);
   rb_define_method(cHttpParser, "initialize", HttpParser_init, 0);
   rb_define_method(cHttpParser, "clear", HttpParser_clear, 0);
@@ -964,14 +966,17 @@ void Init_unicorn_http(void)
 
   rb_define_singleton_method(cHttpParser, "max_header_len=", set_maxhdrlen, 1);
 
-  init_common_fields();
+  init_common_fields(mark_ary);
   SET_GLOBAL(g_http_host, "HOST");
   SET_GLOBAL(g_http_trailer, "TRAILER");
   SET_GLOBAL(g_http_transfer_encoding, "TRANSFER_ENCODING");
   SET_GLOBAL(g_content_length, "CONTENT_LENGTH");

Rack::Request#params EOFError

2017-03-21 Thread John Smart
I found an interesting bug today.  When sending a multipart form
larger than 112KB, I noticed that Rack::Request#params was empty.
This only occurs
when using Unicorn, and does not occur when using Thin.

I dug into Rack source and noticed that any EOFError raised while
reading the body input stream will cause the post body input stream to
be ignored:

https://github.com/rack/rack/blob/cabaa58fe6ac355623746e287475af88c9395d66/lib/rack/request.rb#L357

When multipart > 112KB, I noticed that Unicorn tees the input stream
to a temporary file.  I was wondering: might Unicorn::TeeInput raise
an EOFError as part of normal operation when reaching the end of the
input stream?  If so, this would break the Rack spec.  I only tested
this on Darwin, still working on a self-contained repro.

I ended up raising client_body_buffer_size to 1MB as a work-around.
I'm wondering if this is a bug?
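
For reference, the work-around in our unicorn config is just
(the value is in bytes):

  client_body_buffer_size 1024 * 1024 # 1MB, up from the 112KB default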



Re: [PATCH] Add worker_exec configuration option V2

2017-03-21 Thread Jeremy Evans
On 03/10 09:19, Eric Wong wrote:
> Jeremy Evans  wrote:
> > Here's V2 of the patch, which I think should address all of the issues
> > you pointed out.
> 
> Thanks.  Pushed to git://80x24.org/unicorn worker_exec I can add
> tests, later, or you can.  I also had some FreeBSD test fixes
> (which might apply to OpenBSD) on a VM somewhere which I'll Cc:
> you on: there was also just SO_KEEPALIVE fix I posted:
> 
>   https://bogomips.org/unicorn-public/20170310203431.28067-...@80x24.org/raw

The C test code also returns 8 on OpenBSD, FWIW.  I'm happy to test any
test fixes on OpenBSD, just let me know.

> 
> >  worker_nr = -1
> >  until (worker_nr += 1) == @worker_processes
> >@workers.value?(worker_nr) and next
> >worker = Unicorn::Worker.new(worker_nr)
> >before_fork.call(self, worker)
> > -  if pid = fork
> > -@workers[pid] = worker
> > -worker.atfork_parent
> > +
> > +  pid = if @worker_exec
> > +worker_spawn(worker)
> >else
> > -after_fork_internal
> > -worker_loop(worker)
> > -exit
> > +fork do
> > +  after_fork_internal
> > +  worker_loop(worker)
> > +  exit
> > +end
> 
> I prefer to avoid the block with fork.  The block deepens the
> stack for the running app, so it can affect GC efficiency.
> 
> Can be fixed in a separate patch...

That makes sense.  If you would like me to send a separate patch to fix
it, I can do that.

Thanks,
Jeremy
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH (ccc)] http_request: support proposed Raindrops::TCP states on

2017-03-21 Thread Simon Eskildsen
This looks good to me.

Only question is, why not make Raindrops 0.18+ a requirement to avoid
the Raindrops.const_defined?(:TCP) check?



Re: [PATCH] Add worker_exec configuration option

2017-03-21 Thread Jeremy Evans
On 03/08 08:02, Eric Wong wrote:
> Jeremy Evans  wrote:
> > The worker_exec configuration option makes all worker processes
> > exec after forking.  This initializes the worker processes with
> > separate memory layouts, defeating address space discovery
> > attacks on operating systems supporting address space layout
> > randomization, such as Linux, MacOS X, NetBSD, OpenBSD, and
> > Solaris.
> > 
> > Support for execing workers is very similar to support for reexecing
> > the master process.  The main difference is the worker's to_i and
> > master pipes also need to be inherited after worker exec just as the
> > listening sockets need to be inherited after reexec.
> 
> Thanks, this seems like an acceptable feature.



Eric,

Thanks for your detailed review.  I will work on an updated patch and
try to send it tomorrow or Friday.

Thanks,
Jeremy
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add after_worker_exit configuration option

2017-03-21 Thread Eric Wong
Jeremy Evans  wrote:
> This option can be used to implement custom behavior for handling
> worker exits.  For example, let's say you have a specific request
> that crashes a worker process, which you expect to be due to a
> improperly programmed C extension. By modifying your worker to
> save request related data in a temporary file and using this option,
> you can get a record of what request is crashing the application,
> which will make debugging easier.
> 
> This is not a complete patch as it doesn't include tests, but
> before writing tests I wanted to see if this is something you'd
> consider including in unicorn.

What advantage does this have over Ruby's builtin at_exit?

AFAIK, at_exit may be registered inside after_fork hooks to the
same effect.
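
i.e. something roughly like this in the config file (just a sketch;
what you record there is up to the app):

  after_fork do |server, worker|
    at_exit do
      # runs as the worker process exits; dump whatever request
      # state the app stashed for post-mortem debugging
      server.logger.error("worker=#{worker.nr} exiting")
    end
  end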

Thanks.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH] input: update documentation and hide internals.

2017-03-21 Thread Eric Wong
rack 2.x exists nowadays and still allows rewindable input as an
option, and we will still enable it by default to avoid breaking
any existing code.

Hide the internal documentation since we do not want people
depending on unicorn internals; so there's no reason to confuse
or overwhelm people with documentation about it.  Arguably,
TeeInput and StreamInput should not be documented publicly at
all, but I guess that ship has long sailed...
---
 lib/unicorn/configurator.rb | 13 ++---
 lib/unicorn/stream_input.rb |  9 +
 lib/unicorn/tee_input.rb| 14 +++---
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index 7ed5ffa..9f7f56f 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -55,7 +55,7 @@ class Unicorn::Configurator
 :pid => nil,
 :preload_app => false,
 :check_client_connection => false,
-:rewindable_input => true, # for Rack 2.x: (Rack::VERSION[0] <= 1),
+:rewindable_input => true,
 :client_body_buffer_size => Unicorn::Const::MAX_BODY,
   }
   #:startdoc:
@@ -505,13 +505,12 @@ def preload_app(bool)
   # Disabling rewindability can improve performance by lowering
   # I/O and memory usage for applications that accept uploads.
   # Keep in mind that the Rack 1.x spec requires
-  # \env[\"rack.input\"] to be rewindable, so this allows
-  # intentionally violating the current Rack 1.x spec.
+  # \env[\"rack.input\"] to be rewindable,
+  # but the Rack 2.x spec does not.
   #
-  # +rewindable_input+ defaults to +true+ when used with Rack 1.x for
-  # Rack conformance.  When Rack 2.x is finalized, this will most
-  # likely default to +false+ while still conforming to the newer
-  # (less demanding) spec.
+  # +rewindable_input+ defaults to +true+ for compatibility.
+  # Setting it to +false+ may be safe for applications and
+  # frameworks developed for Rack 2.x and later.
   def rewindable_input(bool)
 set_bool(:rewindable_input, bool)
   end
diff --git a/lib/unicorn/stream_input.rb b/lib/unicorn/stream_input.rb
index de5aeea..41d28a0 100644
--- a/lib/unicorn/stream_input.rb
+++ b/lib/unicorn/stream_input.rb
@@ -1,16 +1,17 @@
 # -*- encoding: binary -*-
 
-# When processing uploads, Unicorn may expose a StreamInput object under
-# "rack.input" of the (future) Rack (2.x) environment.
+# When processing uploads, unicorn may expose a StreamInput object under
+# "rack.input" of the Rack environment when
+# Unicorn::Configurator#rewindable_input is set to +false+
 class Unicorn::StreamInput
   # The I/O chunk size (in +bytes+) for I/O operations where
   # the size cannot be user-specified when a method is called.
   # The default is 16 kilobytes.
-  @@io_chunk_size = Unicorn::Const::CHUNK_SIZE
+  @@io_chunk_size = Unicorn::Const::CHUNK_SIZE # :nodoc:
 
   # Initializes a new StreamInput object.  You normally do not have to call
   # this unless you are writing an HTTP server.
-  def initialize(socket, request)
+  def initialize(socket, request) # :nodoc:
 @chunked = request.content_length.nil?
 @socket = socket
 @parser = request
diff --git a/lib/unicorn/tee_input.rb b/lib/unicorn/tee_input.rb
index 6f66162..2ccc2d9 100644
--- a/lib/unicorn/tee_input.rb
+++ b/lib/unicorn/tee_input.rb
@@ -1,6 +1,6 @@
 # -*- encoding: binary -*-
 
-# acts like tee(1) on an input input to provide a input-like stream
+# Acts like tee(1) on an input input to provide a input-like stream
 # while providing rewindable semantics through a File/StringIO backing
 # store.  On the first pass, the input is only read on demand so your
 # Rack application can use input notification (upload progress and
@@ -9,22 +9,22 @@
 # strict interpretation of Rack::Lint::InputWrapper functionality and
 # will not support any deviations from it.
 #
-# When processing uploads, Unicorn exposes a TeeInput object under
-# "rack.input" of the Rack environment.
+# When processing uploads, unicorn exposes a TeeInput object under
+# "rack.input" of the Rack environment by default.
 class Unicorn::TeeInput < Unicorn::StreamInput
   # The maximum size (in +bytes+) to buffer in memory before
   # resorting to a temporary file.  Default is 112 kilobytes.
-  @@client_body_buffer_size = Unicorn::Const::MAX_BODY
+  @@client_body_buffer_size = Unicorn::Const::MAX_BODY # :nodoc:
 
   # sets the maximum size of request bodies to buffer in memory,
   # amounts larger than this are buffered to the filesystem
-  def self.client_body_buffer_size=(bytes)
+  def self.client_body_buffer_size=(bytes) # :nodoc:
 @@client_body_buffer_size = bytes
   end
 
   # returns the maximum size of request bodies to buffer in memory,
   # amounts larger than this are buffered to the filesystem
-  def self.client_body_buffer_size
+  def self.client_body_buffer_size # :nodoc:
 @@client_body_buffer_size
   end
 
@@ -37,7 +37,7 @@ def new_tmpio # :nodoc:
 
   # Initializes a new TeeInput object.  You normally do not have to ca

Re: [PATCH (ccc)] http_request: support proposed Raindrops::TCP states on

2017-03-21 Thread Eric Wong
Simon Eskildsen  wrote:
> This looks good to me.

Thanks for taking a look.

> Only question is, why not make Raindrops 0.18+ a requirement to avoid
> the Raindrops.const_defined?(:TCP)?

I'd rather have some wiggle room in case problems are found on
either side; so people can rollback one without affecting the
other(*).

It also relaxes things for distro packagers for coordinating
releases; in case there's other dependencies (whether human or
technical) which slow down the process.


(*) We may drop raindrops as a hard requirement for unicorn at
some point, too.  The intended use case for counter sharing
across hundreds/thousands of workers on massively multicore
CPUs never surfaced.
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Patch: Add after_worker_ready configuration option V2

2017-03-21 Thread Jeremy Evans
Here's V2 of the after_worker_ready patch.  This adds some more
documentation, and tries to give a better description of the
advantages in the commit message.

From cbc6fe845ade8946b7db2ecd2b86a0bd8e18bbb8 Mon Sep 17 00:00:00 2001
From: Jeremy Evans 
Date: Tue, 21 Feb 2017 16:33:09 -0800
Subject: [PATCH] Add after_worker_ready configuration option

This adds a hook that is called after the application has
been loaded by the worker process, directly before it starts
accepting requests.  This hook is necessary if your application
needs to gain access to resources during initialization,
and then drop privileges before serving requests.

This is especially useful in conjunction with chroot support
so the app can load all the normal ruby libraries it needs
to function, and then chroot before accepting requests.

If you are preloading the app, it's possible to drop privileges
or chroot in after_fork, but if you are not preloading the app,
the only way to currently do this is to override the private
HttpServer#init_worker_process method, and overriding private
methods is a recipe for future breakage if the internals are
modified.  This hook allows for such functionality to be
supported and not break in future versions of Unicorn.
---
 lib/unicorn/configurator.rb | 22 ++
 lib/unicorn/http_server.rb  |  4 ++--
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/lib/unicorn/configurator.rb b/lib/unicorn/configurator.rb
index 1e2b6e4..7ed5ffa 100644
--- a/lib/unicorn/configurator.rb
+++ b/lib/unicorn/configurator.rb
@@ -49,6 +49,9 @@ class Unicorn::Configurator
   server.logger.error(m)
 end
   },
+:after_worker_ready => lambda { |server, worker|
+server.logger.info("worker=#{worker.nr} ready")
+  },
 :pid => nil,
 :preload_app => false,
 :check_client_connection => false,
@@ -172,6 +175,21 @@ def after_worker_exit(*args, &block)
 set_hook(:after_worker_exit, block_given? ? block : args[0], 3)
   end
 
+  # sets after_worker_ready hook to a given block.  This block will be called
+  # by a worker process after it has been fully loaded, directly before it
+  # starts responding to requests:
+  #
+  #  after_worker_ready do |server,worker|
+  #server.logger.info("worker #{worker.nr} ready, dropping privileges")
+  #worker.user('username', 'groupname')
+  #  end
+  #
+  # Do not use Configurator#user if you rely on changing users in the
+  # after_worker_ready hook.
+  def after_worker_ready(*args, &block)
+set_hook(:after_worker_ready, block_given? ? block : args[0])
+  end
+
   # sets before_fork got be a given Proc object.  This Proc
   # object will be called by the master process before forking
   # each worker.
@@ -569,6 +587,10 @@ def working_directory(path)
   # This switch will occur after calling the after_fork hook, and only
   # if the Worker#user method is not called in the after_fork hook
   # +group+ is optional and will not change if unspecified.
+  #
+  # Do not use Configurator#user if you rely on changing users in the
+  # after_worker_ready hook.  Instead, you need to call Worker#user
+  # directly in after_worker_ready.
   def user(user, group = nil)
 # raises ArgumentError on invalid user/group
 Etc.getpwnam(user)
diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index c2086cb..ef897ad 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -15,7 +15,7 @@ class Unicorn::HttpServer
 :before_fork, :after_fork, :before_exec,
 :listener_opts, :preload_app,
 :orig_app, :config, :ready_pipe, :user
-  attr_writer   :after_worker_exit
+  attr_writer   :after_worker_exit, :after_worker_ready
 
   attr_reader :pid, :logger
   include Unicorn::SocketHelper
@@ -644,7 +644,7 @@ def worker_loop(worker)
 trap(:USR1) { nr = -65536 }
 
 ready = readers.dup
-@logger.info "worker=#{worker.nr} ready"
+@after_worker_ready.call(self, worker)
 
 begin
   nr < 0 and reopen_worker_logs(worker.nr)
-- 
2.11.0

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Eric Wong
> Simon Eskildsen  wrote:
> > +  tcp_info = Raindrops::TCP_Info.new(socket)
> > +  raise Errno::EPIPE, "client closed connection".freeze, [] if
> > closed_state?(tcp_info.state)

Also, I guess if you're using this frequently, it might make
sense to keep the tcp_info object around and recycle it
using the Raindrops::TCP_Info#get! method.
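
That is, roughly, instead of allocating a new object per request:

  @tcp_info ||= Raindrops::TCP_Info.new(socket) # allocate once per process
  @tcp_info.get!(socket)                        # later calls refresh in place
  if closed_state?(@tcp_info.state)
    raise Errno::EPIPE, "client closed connection", []
  end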

#get! has always been supported in raindrops, but I just noticed
RDoc wasn't picking it up properly :x

Anyways I've fixed the documentation over on the raindrops list

  https://bogomips.org/raindrops-public/20170301025541.26183-...@80x24.org/
  ("[PATCH] ext: fix documentation for C ext-defined classes")

  https://bogomips.org/raindrops-public/20170301025546.26233-...@80x24.org/
  ("[PATCH] TCP_Info: custom documentation for #get!")

... and updated the RDoc on https://bogomips.org/raindrops/

Heck, I wonder if it even makes sense to reuse a frozen empty
Array when raising the exception...
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Eric Wong
Simon Eskildsen  wrote:
> Here's another update Eric!

Thanks, a few teeny issues fixed up locally (but commented
inline, below).

However, there's a bigger problem which I'm Cc:-ing Aman about
for tmm1/gctools changing process_client in the internal API.

I won't burden him maintaining extra versions, nor will I force
users to use a certain version of unicorn or gctools to match.

Aman: for reference, relevant messages in the archives:

https://bogomips.org/unicorn-public/?q=d:20170222-+check_client_connection&x=t

(TL;DR: I don't expect Aman will have to do anything,
 just keeping him in the loop)

> +++ b/lib/unicorn/http_server.rb
> @@ -558,8 +558,8 @@ def e100_response_write(client, env)
> 
># once a client is accepted, it is processed in its entirety here
># in 3 easy steps: read request, call app, write app response
> -  def process_client(client)
> -status, headers, body = @app.call(env = @request.read(client))
> +  def process_client(client, listener)
> +status, headers, body = @app.call(env = @request.read(client, listener))

I can squash this fix in, locally:

diff --git a/lib/unicorn/oob_gc.rb b/lib/unicorn/oob_gc.rb
index 5572e59..74a1d51 100644
--- a/lib/unicorn/oob_gc.rb
+++ b/lib/unicorn/oob_gc.rb
@@ -67,8 +67,8 @@ def self.new(app, interval = 5, path = %r{\A/})
 
   #:stopdoc:
   PATH_INFO = "PATH_INFO"
-  def process_client(client)
-super(client) # Unicorn::HttpServer#process_client
+  def process_client(client, listener)
+super(client, listener) # Unicorn::HttpServer#process_client
 if OOBGC_PATH =~ OOBGC_ENV[PATH_INFO] && ((@@nr -= 1) <= 0)
   @@nr = OOBGC_INTERVAL
   OOBGC_ENV.clear

However, https://github.com/tmm1/gctools also depends on this
undocumented internal API of ours; and I will not consider
breaking it for release...

Pushed to the "ccc-tcp" branch @ git://bogomips.org/unicorn
(commit beaee769c6553bf4e0260be2507b8235f0aa764f)

>  begin
>return if @request.hijacked?
> @@ -655,7 +655,7 @@ def worker_loop(worker)
>  # Unicorn::Worker#kgio_tryaccept is not like accept(2) at all,
>  # but that will return false
>  if client = sock.kgio_tryaccept
> -  process_client(client)
> +  process_client(client, sock)
>nr += 1
>worker.tick = time_now.to_i
>  end


> @@ -28,6 +29,7 @@ class Unicorn::HttpParser
># Drop these frozen strings when Ruby 2.2 becomes more prevalent,
># 2.2+ optimizes hash assignments when used with literal string keys
>HTTP_RESPONSE_START = [ 'HTTP', '/1.1 ']
> +  EMPTY_ARRAY = [].freeze

(minor) this was made before commit e06b699683738f22
("http_request: freeze constant strings passed IO#write")
but I've trivially fixed it up, locally.

> +def check_client_connection(socket, listener) # :nodoc:
> +  if Kgio::TCPServer === listener
> +@@tcp_info ||= Raindrops::TCP_Info.new(socket)
> +@@tcp_info.get!(socket)
> +raise Errno::EPIPE, "client closed connection".freeze,
> EMPTY_ARRAY if closed_state?(@@tcp_info.state)

(minor) I needed to wrap that line since I use gigantic fonts
(fixed up locally)
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: check_client_connection using getsockopt(2)

2017-03-21 Thread Simon Eskildsen
On Wed, Feb 22, 2017 at 1:33 PM, Eric Wong  wrote:
> Simon Eskildsen  wrote:
>
>  great to know it's still working after all these years :>
>
>> This confirms Eric's comment that the existing
>> `check_client_connection` works perfectly on loopback, but as soon as
>> you put an actual network between the Unicorn and client—it's only
>> effective 20% of the time, even with `TCP_NODELAY`. I'm assuming, due
>> to buffering, even when disabling Nagle's. As we're changing our
>> architecture, we move from ngx (lb) -> ngx (host) -> unicorn to ngx
>> (lb) -> unicorn. That means this patch will no longer work for us.
>
> Side comment: I'm a bit curious how you guys arrived at removing
> nginx at the host level (if you're allowed to share that info)
>
> I've mainly kept nginx on the same host as unicorn, but used
> haproxy or similar (with minimal buffering) at the LB level.
> That allows better filesystem load distribution for large uploads.


Absolutely. We're starting to experiment with Kubernetes, and as a result
we're interested in simplifying ingress as much as possible. We could
still run them, but if I can avoid explaining why they're there for
the next 5 years, I'll be happy :) As I see it, the current use-cases we
have for the host nginx are (and why we can get rid of each):

* Keepalive between edge nginx LBs and host LBs to avoid excessive
network traffic. This is not a deal-breaker, so we'll just ignore this
problem. It's a _massive_ amount of traffic normally though.
* Out of rotation. We take nodes gracefully out of rotation by
touching a file that'll return 404 on a health-checking endpoint from
the edge LBs. Kubernetes implements this by just removing all the
containers on that node.
* Graceful deploys. When we deploy with containers, we take each
container out of rotation with the local host Nginx, wait for it to
come up, and put it back in rotation. We don't utilize Unicorns
graceful restarts because we want a Ruby upgrade deploy (replacing the
container) to be the same as an updated code rollout.
* File uploads. As you mention, currently we load-balance this between
them. I haven't finished the investigation on whether this is feasible
on the front LBs. Otherwise we may go for a 2-tier Nginx solution or
expand the edge. However, with Kubernetes I'd really like to avoid
having host nginxes. It's not very native to the Kubernetes paradigm.
Does it balance across the network, or only to the pods on that node?
* check_client_connection working. :-) This thread.

Slow-client protection and the other such advantages we already get from
the edge Nginxes.

>> I propose instead of the early `write(2)` hack, that we use
>> `getsockopt(2)` with the `TCP_INFO` flag and read the state of the
>> socket. If it's in `CLOSE_WAIT` or `CLOSE`, we kill the connection and
>> move on to the next.
>>
>> https://github.com/torvalds/linux/blob/8fa3b6f9392bf6d90cb7b908e07bd90166639f0a/include/uapi/linux/tcp.h#L163
>>
>> This is not going to be portable, but we can do this on only Linux
>> which I suspect is where most production deployments of Unicorn that
>> would benefit from this feature run. It's not useful in development
>> (which is likely more common to not be Linux). We could also introduce
>> it under a new configuration option if desired. In my testing, this
>> works to reject 100% of requests early when not on loopback.
>
> Good to know about it's effectiveness!  I don't mind adding
> Linux-only features as long as existing functionality still
> works on the server-oriented *BSDs.
>
>> The code is essentially:
>>
>> def client_closed?
>>   tcp_info = socket.getsockopt(Socket::SOL_TCP, Socket::TCP_INFO)
>>   state = tcp_info.unpack("c")[0]
>>   state == TCP_CLOSE || state == TCP_CLOSE_WAIT
>> end
>
> +Cc: raindrops-pub...@bogomips.org -> https://bogomips.org/raindrops-public/
>
> Fwiw, raindrops (already a dependency of unicorn) has TCP_INFO
> support under Linux:
>
> https://bogomips.org/raindrops.git/tree/ext/raindrops/linux_tcp_info.c
>
> I haven't used it, much, but FreeBSD also implements a subset of
> this feature, nowadays, and ought to be supportable, too.  We
> can look into adding it for raindrops.

Cool, I think it makes sense to use Raindrops here, the advantage being
that it'd use the actual struct instead of a sketchy `#unpack`.

I meant to ask, in Raindrops why do you use the netlink API to get the
socket backlog instead of `getsockopt(2)` with `TCP_INFO` to get
`tcpi_unacked`? (as described in
http://www.ryanfrantz.com/posts/apache-tcp-backlog/) We use this to
monitor socket backlogs with a sidekick Ruby daemon. Although we're
looking to replace it with a middleware to simplify for Kubernetes.
It's one of our main metrics for monitoring performance, especially
around deploys.
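
(To make the question concrete, the kind of per-listener check I mean is
roughly the following; a sketch which assumes Raindrops::TCP_Info exposes
tcpi_unacked as #unacked and that, on a listening socket, it reports the
current accept-queue depth:)

require 'raindrops'

# Linux-only: getsockopt(TCP_INFO) on a listener we hold open ourselves
def accept_queue_depth(listener)
  Raindrops::TCP_Info.new(listener).unacked
end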

>
> I don't know about other BSDs...
>
>> This could be called at the top of `#process_client` in `http_server.rb`.
>>
>> Would there be interest in this upstream? Any comments on this
>> proposed implementation? Currently, we're using a middleware with the
>> Ra

Re: [PATCH] Add worker_exec configuration option V2

2017-03-21 Thread Jeremy Evans
On 03/11 07:18, Eric Wong wrote:
> Jeremy Evans  wrote:
> > On 03/10 09:19, Eric Wong wrote:
> > > Jeremy Evans  wrote:
> > > > -  if pid = fork
> > > > -@workers[pid] = worker
> > > > -worker.atfork_parent
> > > > +
> > > > +  pid = if @worker_exec
> > > > +worker_spawn(worker)
> > > >else
> > > > -after_fork_internal
> > > > -worker_loop(worker)
> > > > -exit
> > > > +fork do
> > > > +  after_fork_internal
> > > > +  worker_loop(worker)
> > > > +  exit
> > > > +end
> > > 
> > > I prefer to avoid the block with fork.  The block deepens the
> > > stack for the running app, so it can affect GC efficiency.
> > > 
> > > Can be fixed in a separate patch...
> > 
> > That makes sense.  If you would like me to send a separate patch to fix
> > it, I can do that.
> 
> Yes, please.
> 
> Not sure if we should automate the stack depth test...
> I resisted it in the past since it might be too fragile
> w.r.t changes to Ruby (even using "unicorn" via the
> RubyGems-generated wrapper deepens it by 2).
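
A quick way to eyeball the difference (throwaway sketch, not part of the
patch below; caller.size is just a rough proxy for the extra frames the
block form keeps on the child's stack):

$stdout.sync = true # don't lose output when the children call exit!

def block_fork
  fork { puts "with block: #{caller.size} frames"; exit! }
  Process.wait
end

def plain_fork
  pid = fork
  if pid.nil?
    puts "without block: #{caller.size} frames"
    exit!
  end
  Process.wait(pid)
end

block_fork # the child here reports a deeper stack than the one below
plain_fork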

Here's a patch to fix the stack depth issue:

From aa4914846a0870e5b01530a47d6dbfe09aeb39ce Mon Sep 17 00:00:00 2001
From: Jeremy Evans 
Date: Mon, 13 Mar 2017 08:28:54 -0700
Subject: [PATCH] Don't pass a block for fork when forking workers

This reduces the stack depth, making GC more efficient.
---
 lib/unicorn/http_server.rb | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index a5bd2c4..023df10 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -541,14 +541,12 @@ def spawn_missing_workers
   worker = Unicorn::Worker.new(worker_nr)
   before_fork.call(self, worker)
 
-  pid = if @worker_exec
-worker_spawn(worker)
-  else
-fork do
-  after_fork_internal
-  worker_loop(worker)
-  exit
-end
+  pid = @worker_exec ?  worker_spawn(worker) : fork
+  
+  unless pid
+after_fork_internal
+worker_loop(worker)
+exit
   end
 
   @workers[pid] = worker
-- 
2.11.0

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



[PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Simon Eskildsen
The check_client_connection implementation based on an early write is
ineffective when the client is not on loopback. In my testing with the
client on another host, we only see a 10-20% rejection rate for closed
clients, even with TCP_NODELAY. This means 80-90% of responses in this
case are rendered in vain.

This patch uses getsockopt(2) with TCP_INFO on Linux to read the socket state.
If the socket is in a state where it cannot take a response, we abort the
request with an error similar to what write(2) would give us on a closed
socket.

In my testing, this has a 100% rejection rate. This testing was conducted with
the following script:

100.times do |i|
  client = TCPSocket.new("some-unicorn", 20_000)
  client.write("GET /collections/#{rand(1)}
HTTP/1.1\r\nHost:walrusser.myshopify.com\r\n\r\n")
  client.close
end
---
 lib/unicorn/http_request.rb | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/lib/unicorn/http_request.rb b/lib/unicorn/http_request.rb
index 0c1f9bb..b4c95fc 100644
--- a/lib/unicorn/http_request.rb
+++ b/lib/unicorn/http_request.rb
@@ -31,6 +31,9 @@ class Unicorn::HttpParser
   @@input_class = Unicorn::TeeInput
   @@check_client_connection = false

+  # TCP_TIME_WAIT: 6, TCP_CLOSE: 7, TCP_CLOSE_WAIT: 8, TCP_LAST_ACK: 9
+  IGNORED_CHECK_CLIENT_SOCKET_STATES = (6..9)
+
   def self.input_class
 @@input_class
   end
@@ -83,10 +86,18 @@ def read(socket)
   false until add_parse(socket.kgio_read!(16384))
 end

-# detect if the socket is valid by writing a partial response:
-if @@check_client_connection && headers?
-  self.response_start_sent = true
-  HTTP_RESPONSE_START.each { |c| socket.write(c) }
+# detect if the socket is valid by checking client connection.
+if @@check_client_connection
+  if defined?(Raindrops::TCP_Info)
+tcp_info = Raindrops::TCP_Info.new(socket)
+if IGNORED_CHECK_CLIENT_SOCKET_STATES.cover?(tcp_info.state)
+  socket.close
+  raise Errno::EPIPE
+end
+  elsif headers?
+self.response_start_sent = true
+HTTP_RESPONSE_START.each { |c| socket.write(c) }
+  end
 end

 e['rack.input'] = 0 == content_length ?
-- 
2.11.0



Re: check_client_connection using getsockopt(2)

2017-03-21 Thread Eric Wong
Simon Eskildsen  wrote:

 Thanks for the writeup!

Another sidenote: It seems nginx <-> unicorn is a bit odd
for deployment in a containerized environment(*).

> I meant to ask, in Raindrops why do you use the netlink API to get the
> socket backlog instead of `getsockopt(2)` with `TCP_INFO` to get
> `tcpi_unacked`? (as described in
> http://www.ryanfrantz.com/posts/apache-tcp-backlog/) We use this to
> monitor socket backlogs with a sidekick Ruby daemon. Although we're
> looking to replace it with a middleware to simplify for Kubernetes.
> It's one of our main metrics for monitoring performance, especially
> around deploys.

The netlink API allows independently-spawned processes to
monitor others; so it can be system-wide.  TCP_INFO requires the
process doing the checking to also have the socket open.

I guess this reasoning for using netlink is invalid for containers,
though...
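
Fwiw, Raindrops::Linux.tcp_listener_stats is the netlink path, so a
standalone monitor can be as small as this (sketch):

require 'raindrops'

# inet_diag via netlink: no need to hold the listener FD, so this can run
# as a completely separate process (or check every listener on the box)
stats = Raindrops::Linux.tcp_listener_stats(%w(0.0.0.0:8080))
stats.each do |addr, s|
  printf("%s active=%d queued=%d\n", addr, s.active, s.queued)
end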

> I was going to use `env["unicorn.socket"]`/`env["puma.socket"]`, but
> you could also do `env.delete("hijack_io")` after hijacking to allow
> Unicorn to still render the response. Unfortunately the
> `.socket` key is not part of the Rack standard, so I'm
> hesitant to use it. When this gets into Unicorn I'm planning to
> propose it upstream to Puma as well.

I was going to say env.delete("rack.hijack_io") is dangerous
(since env could be copied by middleware); but apparently not:
rack.hijack won't work with a copied env, either.
You only need to delete with the same env object you call
rack.hijack with.

But calling rack.hijack followed by env.delete may still
have unintended side-effects in other servers; so I guess
we (again) cannot rely on hijack working portably.
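
For the archives, the pattern being discussed is roughly this (sketch
only; as noted, other servers are not guaranteed to tolerate the delete
after hijacking, and reject_if_closed is a hypothetical helper):

class CheckClientConnection
  def initialize(app)
    @app = app
  end

  def call(env)
    io = env["rack.hijack"].call # grab the client socket
    reject_if_closed(io)         # hypothetical TCP_INFO peek would go here
    env.delete("rack.hijack_io") # "un-hijack" so the server still writes the response
    @app.call(env)
  end

  def reject_if_closed(io)
    # placeholder for a Raindrops::TCP_Info state check
  end
end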

> Cool. How would you suggest I check for TCP_INFO compatible platforms
> in Unicorn? Is `RUBY_PLATFORM.ends_with?("linux".freeze)` sufficient
> or do you prefer another mechanism? I agree that we should fall back
> to the write hack on other platforms.

The Raindrops::TCP_Info class should be undefined on unsupported
platforms, so I think you can check for that, instead.  Then it
should be transparent to add FreeBSD support from unicorn's
perspective.
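
i.e. (sketch):

# feature-detect instead of sniffing RUBY_PLATFORM; the constant is only
# defined where raindrops could build its TCP_Info support
if defined?(Raindrops::TCP_Info)
  # use the getsockopt(TCP_INFO)-based check
else
  # fall back to the early write() check
end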


(*) So I've been wondering if adding a "unicorn-mode" to an
existing C10K, slow-client-capable Ruby/Rack server +
reverse proxy makes sense for containerized deploys.
Maybe...
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: [PATCH] check_client_connection: use tcp state on linux

2017-03-21 Thread Eric Wong
Eric Wong  wrote:
> Simon Eskildsen  wrote:
> > @@ -28,6 +29,7 @@ class Unicorn::HttpParser
> ># Drop these frozen strings when Ruby 2.2 becomes more prevalent,
> ># 2.2+ optimizes hash assignments when used with literal string keys
> >HTTP_RESPONSE_START = [ 'HTTP', '/1.1 ']
> > +  EMPTY_ARRAY = [].freeze
> 
> (minor) this was made before commit e06b699683738f22
> ("http_request: freeze constant strings passed IO#write")
> but I've trivially fixed it up, locally.

And I actually screwed it up, pushed out ccc-tcp-v2 branch
with that fix squashed in :x

Writing tests, now...
--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/



Re: Patch: Add support for chroot to Worker#user

2017-03-21 Thread Jeremy Evans
On 02/21 07:53, Eric Wong wrote:
> Jeremy Evans  wrote:
> > Any chrooting would need to happen inside Worker#user, because
> > you can't chroot until after you have parsed the list of groups,
> > and you must chroot before dropping root privileges.
> > 
> > chroot is an important security feature, so that if the unicorn
> > process is exploited, file system access is limited to the chroot
> > directory instead of the entire system.
> 
> I'm hesitant to document this as "an important security feature".
> 
> Perhaps "an extra layer of security" is more accurate,
> as there are ways of breaking out of a chroot.

That's fine.  Note that while breaking out of a chroot is easy if you
have root privileges, it is difficult to break out of a chroot
if root privileges have been dropped.  There are a couple of
possibilities listed at https://github.com/earthquake/chw00t, neither of
which is likely in environments where unicorn is typically deployed.

> 
> > This is not a complete patch as it does not include tests. I can
> > add tests if you would consider accepting this feature.
> 
> I wouldn't worry about automated tests if elevated privileges
> are required.
> 
> > -  # Changes the worker process to the specified +user+ and +group+
> > +  # Changes the worker process to the specified +user+ and +group+,
> > +  # and chroots to the current working directory if +chroot+ is set.
> ># This is only intended to be called from within the worker
> ># process from the +after_fork+ hook.  This should be called in
> ># the +after_fork+ hook after any privileged functions need to be
> > @@ -123,7 +124,7 @@ def close # :nodoc:
> ># directly back to the caller (usually the +after_fork+ hook.
> ># These errors commonly include ArgumentError for specifying an
> ># invalid user/group and Errno::EPERM for insufficient privileges
> > -  def user(user, group = nil)
> > +  def user(user, group = nil, chroot = false)
> >  # we do not protect the caller, checking Process.euid == 0 is
> >  # insufficient because modern systems have fine-grained
> >  # capabilities.  Let the caller handle any and all errors.
> > @@ -134,6 +135,10 @@ def user(user, group = nil)
> >Process.initgroups(user, gid)
> >Process::GID.change_privilege(gid)
> >  end
> > +if chroot
> > +  Dir.chroot(Dir.pwd)
> > +  Dir.chdir('/')
> > +end
> 
> Perhaps this can be made more flexible by allowing chroot to
> any specified directory, not just pwd.  Something like:
> 
> chroot = Dir.pwd if chroot == true
> if chroot
>   Dir.chroot(chroot)
>   Dir.chdir('/')
> end
> 
> Anyways I'm inclined to accept this feature.

I had basically the same code originally, but wanted to minimize the API
surface to increase the likelihood of acceptance.  So I'm definitely in
favor of that change.
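
For what it's worth, usage from a config file would then look something
like this (sketch; the exact argument shape is whatever we settle on above):

# unicorn config sketch: drop privileges and chroot the worker
after_fork do |server, worker|
  worker.user('nobody', 'nobody', '/var/empty')
end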

Thanks,
Jeremy

--
unsubscribe: unicorn-public+unsubscr...@bogomips.org
archive: https://bogomips.org/unicorn-public/