How do I unsubscribe from this email list?

On Sat, Aug 2, 2014 at 12:51 AM, Gary Grossman <[email protected]>
wrote:

> Hi Eric,
>
> I work with Michael, and this discussion sure got off on the
> wrong foot... we love unicorn and use it heavily, and just
> want to contribute back to it.
>
> To detail the encoding problem we were trying to fix, unicorn
> uses rb_str_new in several places to create Ruby strings.
> For Ruby 1.9 and later, these strings are assigned ASCII-8BIT
> encoding.
>
> While the Rack specification doesn't dictate what encoding
> should be used for strings in the environment, many
> developers would probably expect the default external encoding
> setting in Encoding.default_external to be used.
>
> Many Rails applications use UTF8 heavily. The use of ASCII-8BIT
> in the env can lead to Encoding::CompatibilityErrors being
> raised when a UTF8 string and ASCII-8BIT string are concatenated,
> which happens frequently when properties like request.url are
> referenced in erb templates. To get around these problems,
> an app would have to force encoding on the strings in the env
> manually. It seems a shame to do this in slower Ruby code when
> it could be done up front by unicorn.
>
> We'd like to propose that unicorn use rb_external_str_new to
> make strings instead of rb_str_new.
>
> Perhaps you have your reasons for continuing to use rb_str_new
> but we figured we'd run this by you.
>
> Here's a proposed patch.
>
> Gary
>
> From befb01530c8d930ba53cc58b979ddf42a4c12565 Mon Sep 17 00:00:00 2001
> From: Gary Grossman <[email protected]>
> Date: Sat, 2 Aug 2014 00:19:30 -0700
> Subject: [PATCH] If unicorn is used with Ruby 1.9 or later, use
>  rb_external_str_new instead of rb_str_new to create strings. The resulting
>  strings will use the default external encoding. Continue using rb_str_new
> for
>  older versions of Ruby.
>
> Using the default external encoding instead of ASCII-8BIT for
> strings is more in line with developer expectations and will cause
> less unexpected bugs such as Encoding::CompatibilityErrors which
> result when, say, a UTF8 string and ASCII-8BIT string are
> concatenated together.
>
> Added a unit test to ensure that strings returned in the Rack
> environment conform to the default external encoding.
> ---
>  ext/unicorn_http/ext_help.h |  6 ++++++
>  test/unit/test_request.rb   | 13 +++++++++++++
>  2 files changed, 19 insertions(+)
>
> diff --git a/ext/unicorn_http/ext_help.h b/ext/unicorn_http/ext_help.h
> index c87c272..6806f8e 100644
> --- a/ext/unicorn_http/ext_help.h
> +++ b/ext/unicorn_http/ext_help.h
> @@ -79,4 +79,10 @@ static int str_cstr_case_eq(VALUE val, const char *ptr,
> long len)
>  #define STR_CSTR_CASE_EQ(val, const_str) \
>    str_cstr_case_eq(val, const_str, sizeof(const_str) - 1)
>
> +#ifdef HAVE_RUBY_ENCODING_H
> +/* Use default external encoding for strings for Ruby 1.9+,
> + * fall back to rb_str_new when unavailable */
> +#define rb_str_new rb_external_str_new
> +#endif
> +
>  #endif /* ext_help_h */
> diff --git a/test/unit/test_request.rb b/test/unit/test_request.rb
> index fbda1a2..0a105e0 100644
> --- a/test/unit/test_request.rb
> +++ b/test/unit/test_request.rb
> @@ -179,4 +179,17 @@ class RequestTest < Test::Unit::TestCase
>      env['rack.input'].rewind
>      res = @lint.call(env)
>    end
> +
> +  def test_encoding
> +    if ''.respond_to?(:encoding)
> +      client = MockRequest.new("GET http://e:3/x?y=z HTTP/1.1\r\n" \
> +                               "Host: foo\r\n\r\n")
> +      env = @request.read(client)
> +      encoding = Encoding.default_external
> +      assert_equal encoding, env['REQUEST_PATH'].encoding
> +      assert_equal encoding, env['PATH_INFO'].encoding
> +      assert_equal encoding, env['QUERY_STRING'].encoding
> +    end
> +  end
> +
>  end
> --
> 1.9.1
>
>
>


-- 
Kapil


Reply via email to