On Sun, Aug 28, 2016 at 8:04 PM, Craig Russell <[email protected]> wrote:
> Can you please take a look and see why the rescue didn’t work?

Logs can be found here:

https://whimsy.apache.org/members/log/

In particular, https://whimsy.apache.org/members/log/whimsy_error.log

What I am still seeing is:

_ERROR #<Encoding::UndefinedConversionError: "\\xE2" from ASCII-8BIT
to UTF-8>, referer:
https://whimsy.apache.org/secretary/workbench/file.cgi

And further up the stack traceback:

_WARN   
/usr/local/rvm/gems/ruby-2.3.1/gems/mail-2.6.4/lib/mail/message.rb:1887:in
`to_s', referer:
https://whimsy.apache.org/secretary/workbench/file.cgi
_WARN   /x1/srv/whimsy/www/secretary/workbench/file.cgi:318:in `block
in send_email', referer:
https://whimsy.apache.org/secretary/workbench/file.cgi

So, you are not hitting the exception handler, and you are dying later
when trying to convert the message (which includes a binary subject)
into a string.
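
For what it's worth, the failure can probably be reproduced outside of
file.cgi. The following is only a minimal sketch: it assumes the subject
arrives tagged as binary (ASCII-8BIT), as the irb session on pending.yml
suggests, and the helper name try_convert is made up for illustration.
It is not the code path that mail's to_s actually takes.

```ruby
# Subject bytes copied from pending.yml; .b returns a copy tagged as
# ASCII-8BIT (binary), which is the assumed state of the real subject.
subject = "ICLA \xE2\x80\x94 Gosha Arinich aka goshakkk".b

# Hypothetical helper: attempt the binary -> UTF-8 conversion and hand
# back the exception instead of letting it propagate.
def try_convert(str)
  str.encode('UTF-8')
rescue Encoding::UndefinedConversionError => e
  e
end

result = try_convert(subject)
puts subject.encoding  # the binary encoding
puts result.class      # the exception class, not a String
```

Ruby has no mapping from a high byte in a binary string to a UTF-8
character, so the conversion gives up at the first such byte, which
here is \xE2.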

The reason you are not hitting the exception handler is that you
are not calling force_encoding.  A second problem is that even if an
exception were raised, you wouldn't catch it, because the exception
name needs to be qualified: Encoding::UndefinedConversionError
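
A rough sketch of both points follows. It is not the actual file.cgi
code: subject_or_fallback and the fallback text 'ICLA' are hypothetical,
and the explicit encode call merely stands in for whatever conversion
the mail library performs later.

```ruby
binary_subject = "ICLA \xE2\x80\x94 Gosha Arinich aka goshakkk".b

# Point 1: relabel the bytes as UTF-8 when they arrive tagged as binary.
# force_encoding changes only the label; it never raises.
fixed = binary_subject.dup
fixed.force_encoding('utf-8') if fixed.encoding == Encoding::BINARY

# Point 2: the rescue clause has to name the exception with its
# namespace, Encoding::UndefinedConversionError.
# Hypothetical helper: build the reply subject, or fall back to a
# caller-supplied default if the conversion fails.
def subject_or_fallback(raw, fallback)
  'Re: ' + raw.encode('UTF-8')
rescue Encoding::UndefinedConversionError
  fallback
end

puts fixed.valid_encoding?
puts subject_or_fallback(binary_subject, 'ICLA')  # takes the rescue path
puts subject_or_fallback(fixed, 'ICLA')           # uses the real subject
```

With an unqualified constant in the rescue clause, Ruby would fail to
resolve the name when it tried to match the exception, so the handler
would not run.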

> Thanks,
>
> Craig

- Sam Ruby

>> On Aug 28, 2016, at 4:30 PM, Sam Ruby <[email protected]> wrote:
>>
>> On Sun, Aug 28, 2016 at 6:15 PM, Craig Russell <[email protected]> 
>> wrote:
>>> I’m blind here. I can’t see the pending.yml. I can’t see the error console. 
>>> I don’t even know if my change was pushed to production.
>>>
>>> What tools do I need to see what’s going on?
>>
>> What code is actually deployed can be seen on the last two lines of
>> the status page: https://whimsy.apache.org/status/
>>
>> Nothing in the (current) workbench shows the raw contents of
>> pending.yml.  It would be easy to add as a new CGI script.  It could
>> even be added as a new action in file.cgi.
>>
>> Alternatively, we could ask for you to be given shell access to
>> whimsy-vm3.
>>
>>> Thanks,
>>>
>>> Craig
>>
>> - Sam Ruby
>>
>>>> On Aug 28, 2016, at 2:30 PM, Craig Russell <[email protected]> 
>>>> wrote:
>>>>
>>>>>
>>>>> On Aug 28, 2016, at 6:04 AM, Sam Ruby <[email protected]> wrote:
>>>>>
>>>>> On Sat, Aug 27, 2016 at 11:23 PM, Craig Russell
>>>>> <[email protected]> wrote:
>>>>>> The processing of email:subject seems to be localized to file.cgi,
>>>>>> around line 261:
>>>>>>
>>>>>>        # override subject?
>>>>>>        if vars.email_subject and !vars.email_subject.empty?
>>>>>>          if vars.email_subject =~ /^re:\s/i
>>>>>>            subject vars.email_subject
>>>>>>          else
>>>>>>            subject 'Re: ' + vars.email_subject
>>>>>>          end
>>>>>>        end
>>>>>>
>>>>>> I can’t see where the actual problem is, but is there a way to either:
>>>>>>
>>>>>> 1. have whichever component created vars.email_subject recognize UTF-8 
>>>>>> characters and pass them as characters instead of binary
>>>>>>
>>>>>> 2. recognize that this has happened here and replace the subject with an 
>>>>>> innocuous subject based on the document type.
>>>>>
>>>>> All of your analysis seems to be on target.
>>>>>
>>>>> This is from the log:
>>>>>
>>>>> [Sat Aug 27 18:36:03.233539 2016] [cgi:error] [pid 3570:tid
>>>>> 139833343252224] [client 73.15.26.163:62667] AH01215: _ERROR
>>>>> #<Encoding::UndefinedConversionError: "\\xE2" from ASCII-8BIT to
>>>>> UTF-8>, referer:
>>>>> https://whimsy.apache.org/secretary/workbench/file.cgi
>>>>>
>>>>> Looking at pending.yml with the interactive ruby shell:
>>>>>
>>>>> $ irb
>>>>> irb(main):001:0> require 'yaml'
>>>>> => true
>>>>> irb(main):002:0> pending = YAML.load_file('pending.yml')
>>>>> => [{"doctype"=>"icla",
>>>>> "source"=>"Gosha-Arinich-me-goshakkk.name--icla.pdf",
>>>>> "realname"=>"Heorhi Arynich", "pubname"=>"Gosha Arinich",
>>>>> "email"=>"[email protected]", "filename"=>"heorhi-arynich.pdf",
>>>>> "nname"=>"Gosha Arinich", "nemail"=>"[email protected]",
>>>>> "iname"=>"Gosha Arinich", "iemail"=>"[email protected]",
>>>>> "uname"=>"Gosha Arinich", "uemail"=>"[email protected]",
>>>>> "pname"=>"Gosha Arinich", "pemail"=>"[email protected]",
>>>>> "memail"=>"[email protected]", "gname"=>"Gosha Arinich",
>>>>> "gemail"=>"[email protected]", "contact"=>"Gosha Arinich",
>>>>> "cemail"=>"[email protected]", "ipodling"=>" ",
>>>>> "email:addr"=>"[email protected]",
>>>>> "email:id"=>"<ca+ttpjt-+d5_o4uqksv+1dbs_fafwfy4zrmtjspxey48ae3...@mail.gmail.com>",
>>>>> "email:name"=>"Gosha Arinich", "email:subject"=>"ICLA \xE2\x80\x94
>>>>> Gosha Arinich aka goshakkk", "svn:mime-type"=>"application/pdf"}]
>>>>> irb(main):003:0> pending.first['email:subject']
>>>>> => "ICLA \xE2\x80\x94 Gosha Arinich aka goshakkk"
>>>>> irb(main):004:0> pending.first['email:subject'].force_encoding('utf-8')
>>>>> => "ICLA — Gosha Arinich aka goshakkk"
>>>>>
>>>>> Not surprising given the tortuous path that the subject goes through
>>>>> in the current workbench implementation.  A cron job extracts the
>>>>> subject line from the email using Python libraries and puts it into an
>>>>> svn property associated with the file.  The workbench then uses the
>>>>> command line to extract that property and parses the output from the
>>>>> command.  What is surprising is that, if there is an error in handling
>>>>> non-ASCII characters, it hasn't shown up before and more
>>>>> frequently.  I'm pretty sure that non-ASCII characters have been seen
>>>>> before, and I'm not sure what is different about this email.
>>>>
>>>> I’ve seen plenty of non-ASCII characters but this is the first I’ve seen 
>>>> one in the three-byte UTF-8 representation.
>>>>>
>>>>> In any case, suggested fixes:
>>>>>
>>>>> 1) add "vars.email_subject.force_encoding('utf-8') if
>>>>> vars.email_subject.encoding == Encoding::BINARY" before the inner if
>>>>> statement.  It should be harmless in cases that currently work, and
>>>>> should fix this case.  In cases where the data is binary data that
>>>>> can't be interpreted as utf-8, it will continue to blow up.
>>>>>
>>>>> 2) add 'begin...rescue...end' around the inner if statement.  Note:
>>>>> you don't need to set subject in the rescue clause as it was set by
>>>>> the relevant erb file (e.g. icla.erb).  More information on rescue
>>>>> statements: http://phrogz.net/programmingruby/tut_exceptions.html
>>>>>
>>>>> These changes should enable you to process the currently pending action.
>>>>
>>>> Now waiting for deployment…
>>>>
>>>> Craig
>>>>
>>>>>
>>>>>> Craig
>>>>>
>>>>> - Sam Ruby
>>>>>
>>>>>>> On Aug 27, 2016, at 12:11 PM, Craig Russell <[email protected]> 
>>>>>>> wrote:
>>>>>>>
>>>>>>> Here’s what happens to the em-dash in whimsy pending.yml:
>>>>>>>
>>>>>>> ---
>>>>>>> - doctype: icla
>>>>>>>   source: craig-russell-copy.pdf
>>>>>>>   realname: Craig Russell Emdash
>>>>>>>   pubname: Craig Russell Emdash
>>>>>>>   email: [email protected]
>>>>>>>   filename: craig-russell-emdash.pdf
>>>>>>>   nname: Craig Russell
>>>>>>>   nemail: [email protected]
>>>>>>>   iname: Craig Russell
>>>>>>>   iemail: [email protected]
>>>>>>>   uname: Craig Russell
>>>>>>>   uemail: [email protected]
>>>>>>>   pname: Craig Russell
>>>>>>>   pemail: [email protected]
>>>>>>>   memail: [email protected]
>>>>>>>   gname: Craig Russell
>>>>>>>   gemail: [email protected]
>>>>>>>   contact: Craig Russell
>>>>>>>   cemail: [email protected]
>>>>>>>   ipodling: " "
>>>>>>>   email:addr: [email protected]
>>>>>>>   email:id: "<[email protected]>"
>>>>>>>   email:name: Craig Russell
>>>>>>>   email:subject: !binary |-
>>>>>>>     RU0gZGFzaCBjYXVzZXMgdHJvdWJsZSDigJQg
>>>>>>>   svn:mime-type: application/pdf
>>>>>>>
>>>>>>>
>>>>>>>> On Aug 27, 2016, at 11:41 AM, Craig Russell <[email protected]> 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> This email (still pending) causes an error when sending mail.
>>>>>>>>
>>>>>>>> I suspect it is because of the em-dash in the subject.
>>>>>>>>
>>>>>>>> I don’t know how to look at or edit the pending.yml on the server.
>>>>>>>>
>>>>>>>> From: Gosha Arinich <[email protected]>
>>>>>>>> Date: Sat, 27 Aug 2016 03:03:00 +0300
>>>>>>>> Message-ID: 
>>>>>>>> <ca+ttpjt-+d5_o4uqksv+1dbs_fafwfy4zrmtjspxey48ae3...@mail.gmail.com>
>>>>>>>> Subject: =?UTF-8?Q?ICLA_=E2=80=94_Gosha_Arinich_aka_goshakkk?=
>>>>>>>> To: [email protected]
>>>>>>>>
>>>>>>>> So, two issues: the pending mail needs to be sent; the bug needs to be 
>>>>>>>> fixed.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Craig
>>>>>>>>
>>>>>>>>> Begin forwarded message:
>>>>>>>>>
>>>>>>>>> From: Gosha Arinich <[email protected]>
>>>>>>>>> Subject: ICLA — Gosha Arinich aka goshakkk
>>>>>>>>> Date: August 26, 2016 at 5:03:00 PM PDT
>>>>>>>>> To: [email protected]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Cheers,
>>>>>>>>> Gosha
>>>>>>>>>
>>>>>>>>
>>>>>>>> Craig L Russell
>>>>>>>> Secretary, Apache Software Foundation
>>>>>>>> [email protected] http://db.apache.org/jdo
>>>>>>>
>>>>>>> Craig L Russell
>>>>>>> Architect
>>>>>>> [email protected]
>>>>>>> P.S. A good JDO? O, Gasp!
>>>>>>>
>>>>>>
>>>>>> Craig L Russell
>>>>>> Architect
>>>>>> [email protected]
>>>>>> P.S. A good JDO? O, Gasp!
>>>>
>>>> Craig L Russell
>>>> Architect
>>>> [email protected]
>>>> P.S. A good JDO? O, Gasp!
>>>
>>> Craig L Russell
>>> Architect
>>> [email protected]
>>> P.S. A good JDO? O, Gasp!
>>>
>
> Craig L Russell
> Architect
> [email protected]
> P.S. A good JDO? O, Gasp!
>
