On Wed, 20 Sep 2017, Owen Jacobson wrote:
> On Sep 19, 2017, at 12:39 AM, Kerim Aydin <[email protected]> wrote:
> > On Tue, 19 Sep 2017, VJ Rada wrote:
> >> Lots of ?s that should be 's
> >
> > Oh, $##!@$#!@. I spent a lot of time replacing smart quotes in lots of
> > peoples' judgements and missed those (that was like 75% of the editing
> > effort). thx.
>
> I would love to help out. I have a fairly solid understanding of Unicode, and
> of RFC 2822 and friends. I suspect curly quotes (U+2018, U+2019, and friends),
> at least, are inevitable, and that any preference any of us may have for a
> seven-bit textual universe is probably futile. I’m very much against actually
> using Emoji and other astral-plane characters in Agora, but it should at
> least be possible.
>
> In my case, curly apostrophes and quotes are inserted for me automatically by
> nearly every text-editing affordance in my OS - macOS’ typographical heritage
> shows through strong. It’s only when I author posts in a programming editor
> (TextMate, as it happens) and copy-paste them into my mail that they come out
> with ASCII quotes (U+0027).
>
> I know you’ve spent considerable effort trying to make Unicode work in your
> CFJ system, and in fixing up encoding issues by hand after the fact. The chain
> of programs you use to manage the CFJ archives _should_ probably handle
> non-ASCII characters transparently, even if it presently doesn’t. Can I share
> the load?

Maybe! The problem is, there's one weak link in my toolchain and I don't have
control of my server to find alternatives. Maybe you have some suggestions.
Here's my workflow:
1. Do preliminary editing/entry in a text editor and save file to server. All
   the tools I use are very good with Unicode and no format is lost here.
2. Use PHP to make nicely formatted cases, build the index, etc. No problems
   here either! If you look on the website all the cases have their nice
   Unicode preserved.
3. Use the PHP mail utility to format and mail the case. THIS IS WHERE IT ALL
   BREAKS. Every PHP (and server CLI) mailing tool to mail a text file from
   the server loses the Unicode and mails in ASCII. I've spent a long time
   googling mail and sendmail. I've added hand-crafted headers. Checked file
   formats. Used command line switches. All that. No dice. Unicode lost in
   the mail.
4. So instead of emailing directly to Official, the PHP code emails it to
   myself, where it shows up with all the Unicode as ????, then I use my email
   client to paste in and fix the Unicode and then forward to Official (my
   email client also has no problems). This is where I miss stuff, doing it
   by hand.
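
One way to sidestep a mail tool that only passes seven-bit text, sketched
under assumptions: build the whole MIME message yourself, declare the charset
explicitly, and base64-encode the body so the bytes on the wire are plain
ASCII. This sketch uses Python's standard email library rather than PHP, and
assumes Python is available on the server, which nothing above confirms. The
function name, the sample text, and the addresses are all hypothetical.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_case_message(case_text, subject, sender, recipient):
    """Build a message whose wire form is pure ASCII but decodes to UTF-8."""
    # policy.SMTP uses CRLF line endings and RFC 2047-encodes any
    # non-ASCII header text when the message is serialized.
    msg = EmailMessage(policy=policy.SMTP)
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Declare the charset explicitly and base64-encode the body, so a
    # seven-bit-only mail tool has nothing it can turn into "?".
    msg.set_content(case_text, charset="utf-8", cte="base64")
    return msg


# Hypothetical case text and addresses, for illustration only.
case_text = "The defendant’s “smart quotes” should survive.\n"
msg = build_case_message(case_text, "CFJ 3536 (example subject)",
                         "judge@example.com", "official@example.com")
wire = msg.as_bytes()

# Round-trip check: parse the wire form back and look for the curly quote.
parsed = message_from_bytes(wire, policy=policy.default)
print(parsed["Content-Type"])
print("’" in parsed.get_content())
```

The resulting bytes could then be handed to whatever transport the server
offers, for example piped to `sendmail -t` (which reads recipients from the
headers) or sent through an SMTP connection. Whether either works on this
particular server is an open question.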

So, the question is, how to get a file, addressable by web, e.g. the raw file
here:
https://faculty.washington.edu/kerim/nomic/cases/3536
into email, with the single touch of a button, while automatically formatting
Subject lines and doing other automatic things important to keeping the
records straight (e.g. comparing to hashes so I know if a case has been
changed since last posted), while preserving Unicode?

An important note: I have no control over software versions on this server,
and the mail and sendmail versions/flavors seem to lack some options that are
supposed to fix this.
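
For the "comparing to hashes" part of the question, a minimal sketch of the
idea: hash the raw bytes of the case file and compare against the digest
stored when it was last posted. SHA-256 and both function names are
assumptions; nothing above says which hash or storage the real system uses.
Hashing bytes rather than decoded text means a swapped apostrophe counts as
a change.

```python
import hashlib


def case_digest(case_bytes):
    # SHA-256 is an assumption; the thread does not say which hash is used.
    return hashlib.sha256(case_bytes).hexdigest()


def changed_since_posted(case_bytes, posted_digest):
    """True if the case was never posted or differs from the posted copy."""
    return posted_digest is None or case_digest(case_bytes) != posted_digest


# Hypothetical case text: the same sentence with an ASCII apostrophe
# (as posted) and with a curly apostrophe (as later edited).
posted = "It's judged TRUE.".encode("utf-8")
edited = "It’s judged TRUE.".encode("utf-8")
stored = case_digest(posted)

print(changed_since_posted(posted, stored))  # False: bytes are identical
print(changed_since_posted(edited, stored))  # True: U+2019 replaced U+0027
```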