Greetings, LyX Land:

I've encouraged people to learn to use LyX, so when they run into trouble,
I feel responsible to try and help.  I use Linux to prepare documents, so
I have not experienced this problem before.  Many people still use Windows
and MS Word and such, and so they do things that I would not expect, and
I am frustrated when these things arise.

I think the question I need to ask you is this: How can I find out what
encoding is currently used in a LyX document, and what should it be to make
the document work properly?  And how can I wrestle all of the characters
into the correct encoding?  Is there no magic wand to scan a LyX text file
and change everything to a desired encoding?
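
To make the question concrete, here is a rough Python sketch of the kind of
scan I have in mind.  It assumes the .lyx file is supposed to be UTF-8 (that
is my understanding of recent LyX versions, but please correct me), and the
function names are made up by me.  It only reports problems; it changes
nothing.

```python
import unicodedata


def check_utf8(raw):
    """Return None if the bytes decode as UTF-8, else the offset of the first bad byte."""
    try:
        raw.decode("utf-8")
        return None
    except UnicodeDecodeError as err:
        return err.start


def list_non_ascii(text):
    """Return (line_number, character, unicode_name) for every non-ASCII character."""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            if ord(ch) > 127:
                # Characters without a Unicode name are reported as "UNKNOWN".
                found.append((lineno, ch, unicodedata.name(ch, "UNKNOWN")))
    return found
```

I would read the file in binary mode, pass the bytes to check_utf8, and, if
that returns None, decode them and pass the text to list_non_ascii to see
which lines hold the suspicious characters.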

Here's the long version:

A student has LyX documents that have lots and lots of invalid characters.
I'm virtually certain most of these were inserted into LyX by a Copy & Paste
from MS Word and/or Adobe Acrobat.  In all of the places where Word used an
apostrophe, we seem to have an illegal character.  I think quotation marks
as well, and probably other characters.  I'm pretty sure the quotation mark
and apostrophe problems result from Word's use of "smart quotes" by default.

I wondered if we shouldn't open the LyX document in Emacs and then search
and replace the bad characters.   If I knew how to insert characters that LyX
would accept, I would do that.
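
If Emacs is not the right tool, the same search and replace could be
scripted.  Below is a minimal Python sketch, assuming the text has already
been read in as Unicode.  The table of replacements is my own guess at what
Word's smart punctuation should become; I do not know whether LyX would
prefer its own quote insets over plain ASCII quotes, so treat the right-hand
sides as assumptions.

```python
# Hypothetical replacement table: Word-style punctuation on the left,
# plain ASCII (LaTeX-friendly) equivalents on the right.
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark (Word's apostrophe)
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "--",   # en dash, e.g. in page ranges
    "\u2014": "---",  # em dash
    "\u2026": "...",  # horizontal ellipsis
}


def asciify_punctuation(text):
    """Replace each character in REPLACEMENTS with its ASCII equivalent."""
    for bad, good in REPLACEMENTS.items():
        text = text.replace(bad, good)
    return text
```

The en dash entry is there because of the page-range problem I have seen in
bib files; anything not in the table is left untouched.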

I think she has a lot of the same trouble with her bibliography, which is a
bib file exported from Zotero.  I have had the problem in my own work that
Zotero will export unexpected characters, such as the long dash in place of
-- in page numbers.  But in the student's document, all of the dates of the
citations show up as ???? when LaTeX processes the document.

So, how to fix this up?

First, how should she configure "Document Settings / Language"?

She's from Southeast Asia, but writing in English.  So perhaps her PC
has more international language features than I'm used to.  For the LyX
language encoding, is "default" not good?  How about utf8, or one of the
other Unicode options?

Incidentally, LyX has the Font button to select XeTeX-supported fonts.
Why doesn't that fix the encoding problem?  Is a font selection not the
same thing as an encoding?

Second, we need to force the document to use only the desired encoding.
It is a bit outside my comprehension that a document would allow one to
paste in an invalid character, but that's just me.

But isn't there a way to convert the characters in one command?

In Linux, I'd try a program like "iconv", if I had a good guess for what
the "from" encoding should be.
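
For the Windows users, the same idea can be written in a few lines of
Python.  This is only a sketch: the default "from" encoding of cp1252 is my
guess for text pasted out of Word on a Western-language Windows setup, and
the function name is made up.  On Linux the equivalent command would be
something like "iconv -f CP1252 -t UTF-8 old.bib > new.bib", with
placeholder file names.

```python
def reencode(raw, from_encoding="cp1252", to_encoding="utf-8"):
    """Decode bytes using the guessed source encoding and re-encode them.

    Raises UnicodeDecodeError if the bytes are not valid in from_encoding,
    which is at least a hint that the guess was wrong.
    """
    text = raw.decode(from_encoding)
    return text.encode(to_encoding)
```

A wrong guess will not always raise an error; it can also silently produce
the wrong characters, so the output still needs to be checked by eye.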

I'd appreciate any advice that I can assemble and pass along to the
students.

I expect that this hassle will end up discouraging everybody and they
will revert to MS Word.


-- 
Paul E. Johnson
Professor, Political Science    Assoc. Director
1541 Lilac Lane, Room 504     Center for Research Methods
University of Kansas               University of Kansas
http://pj.freefaculty.org            http://quant.ku.edu
