Pursuant to my last long-winded email:

I've boiled it down to something very simple.

The company doing our ePubs is mixing

1) Unicode HTML numeric entities (decimal and hex) for diacriticals in the IAST roman char range
2) Unicode UTF-8 strings for Tamil and Devanagari script

and

3) old fashioned punctuation in the range of ISO-8859-1 <http://www.alanwood.net/demos/charsetdiffs.html#a> 128-255. ( I think... )

1 and 2 above work fine. Mixing #3 into the text blocks also, for mysterious reasons I cannot fathom, shows correctly in a browser (I wonder whether it shows correctly on Windows).

http://dev.himalayanacademy.com/book/dancing-with-siva/7

The curly apostrophe in "Sampradaya's" and the mdash on the last line, "souls—he who...", are rendered correctly in the browser. I guess browsers are built to properly render mixed Unicode and ISO 8859-1 text in the same block... just guessing...


BUT! They turn into garbage if copied out of a LC field and pasted into a MySQL column/field. Meanwhile all the Unicode (both HTML entities and UTF strings) is rendered correctly and preserved across all "transport agents" -- GET/POST via an api.php, then processed in stacks with LC, JSON-encoded arrays, etc. -- but not any character in the range of 128-255...

Dunno why, but at least we know what to do... not use any ANSI! Unicode all the way...


It's as simple as:

command getContent pBookPart
   put the uBookFilesLocation of this stack into tPath
   put url ("file:" & tPath & "/ops/xhtml/" & pBookPart) into tText

   # Fix ANSI chars first
   replace "—" with "&#8212;" in tText
   replace "’" with "&#8217;" in tText

   # Unicode all the way!
   put uniDecode(uniEncode(tText,"UTF8")) into tText
   set the htmlText of fld "CurrentChapterText" to tText
end getContent

Now the content in the field can be cut, pasted and moved to MySQL, and the quotes and mdashes all work... hurray...
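
A more general variant, as a sketch only. It assumes LiveCode 7 or later (for textDecode and the codepoint chunk) and that the ePub XHTML files are UTF-8 on disk; neither is confirmed above. Instead of one replace line per punctuation mark, it turns every codepoint above 127 into a numeric entity, so the text that travels to MySQL is plain ASCII. The names asciiSafeHtml and getContentAsEntities are made up for illustration; uBookFilesLocation and the field name come from the handler above.

function asciiSafeHtml pText
   local tOut, tNum, tCode
   repeat for each codepoint tCode in pText
      put codepointToNum(tCode) into tNum
      if tNum > 127 then
         # anything outside ASCII becomes a decimal entity
         put "&#" & tNum & ";" after tOut
      else
         put tCode after tOut
      end if
   end repeat
   return tOut
end asciiSafeHtml

command getContentAsEntities pBookPart
   local tPath, tData, tText
   put the uBookFilesLocation of this stack into tPath
   # binfile: reads the raw bytes without any charset translation
   put url ("binfile:" & tPath & "/ops/xhtml/" & pBookPart) into tData
   # assumption: the file really is UTF-8
   put textDecode(tData, "UTF-8") into tText
   set the htmlText of fld "CurrentChapterText" to asciiSafeHtml(tText)
end getContentAsEntities

If the files turn out not to be UTF-8, the encoding name passed to textDecode would have to change; the entity step stays the same.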


Swasti Astu, Be Well!
Brahmanathaswami

Kauai's Hindu Monastery
www.HimalayanAcademy.com



Mark Schonewille wrote:
Hi Brahmanathaswami,

This works on LC 6.7.3:

on mouseUp
     put fld 1 into x
     if the platform is not "MacOS" then
          // not sure why this works
          put isoToMac(x) into x
     end if
     put uniDecode(uniEncode(x,"UTF8")) into x
     set the htmlText of fld 2 to x
end mouseUp

--
Best regards,

Mark Schonewille

Economy-x-Talk Consulting and Software Engineering
Homepage: http://economy-x-talk.com
Twitter: http://twitter.com/xtalkprogrammer
KvK: 50277553

Installer Maker for LiveCode:
http://qery.us/468

Buy my new book "Programming LiveCode for the Real Beginner" http://qery.us/3fi

LiveCode on Facebook:
https://www.facebook.com/groups/runrev/

On 7/26/2015 21:31, Brahmanathaswami wrote:
We do a lot of work with the contents of ePubs. For those who don't know
the spec:

"someBook.epub" is just "someBook.zip"

which when inflated has a mini-portable web site based on responsive CSS
(all percentages). You get

someBook
/ops # "Open Publication Structure"
/ fonts
/ images
/ styles
/ xhtml
toc.ncx

The xhtml folder then has all these files:
ch09_05_b.html
ch09_05_c.html
ch09_06.html

etc.
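
Since the .epub is a zip, the chapter files could also be read straight out of the archive without inflating it by hand. The following is a rough sketch using the revZip library that ships with LiveCode. It assumes LiveCode 7 or later for textDecode and assumes the chapters are stored as UTF-8. The function names listEpubChapters and readEpubItem are invented for this example, and error checking is left out.

function listEpubChapters pEpubPath
   local tItems
   revZipOpenArchive pEpubPath, "read"
   # one archive item per line, e.g. ops/xhtml/ch09_06.html
   put revZipEnumerateItems(pEpubPath) into tItems
   revZipCloseArchive pEpubPath
   # keep only the chapter files in the xhtml folder
   filter tItems with "*xhtml/*.html"
   return tItems
end listEpubChapters

function readEpubItem pEpubPath, pItemName
   local tData
   revZipOpenArchive pEpubPath, "read"
   # the third parameter is the NAME of the variable to fill
   revZipExtractItemToVariable pEpubPath, pItemName, "tData"
   revZipCloseArchive pEpubPath
   # assumption: the chapter is UTF-8
   return textDecode(tData, "UTF-8")
end readEpubItem

A caller might do something like: put readEpubItem(tEpub, line 1 of listEpubChapters(tEpub)) into tHtml, where tEpub holds the full path to the .epub file.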

The text is pretty advanced in the sense that it uses Unicode... (I
think) for rendering diacriticals, mdashes, etc.

If I simply import the raw file unprocessed into a LC field (7.0.5)... I
get the usual, expected text:

<h3 class="h3s"><samp>Is Monistic Theism Found in the <span
class="cmitalic"><samp>Vedas?</samp></span></samp></h3>
<h4 class="h4"><samp><span class="smallcap"><samp>&#346;LOKA
145</samp></span></samp></h4>
<p class="noindent"><samp><span class="cmbold"><samp>Again and again in
the <em>Vedas </em>and from <em>satgurus </em>we hear “Aha&#7745;
Brahm&#x101;smi,” “I am God,” and that God is both immanent and
transcendent. Taken together, these are clear statements of monistic
theism. Aum Nama&#x1e25; &#x15a;iv&#x101;ya.</samp></span></samp></p>
<h4 class="h4"><samp><span
class="smallcap"><samp>BH&#256;SHYA</samp></span></samp></h4>
<p class="noindent"><samp>Monistic theism is the philosophy of the <span
class="cmitalic"><samp>Vedas</samp></span>. Scholars have long noted
that the Hindu scriptures are alternately monistic, describing the
oneness of the individual soul and God, and theistic, describing the
reality of the Personal God. One cannot read the <span
class="cmitalic"><samp>Vedas</samp></span>, <span
class="cmitalic"><samp>&#x15a;aiva &#256;gamas</samp></span> and hymns
of the saints without being overwhelmed with theism as well as monism.
Monistic theism is the essential teaching of Hinduism, of &#x15a;aivism.
It is the conclusion of Tirumular, Vasugupta, Gorakshanatha, Bhaskara,
Srikantha, Basavanna, Vallabha, Ramakrishna, Yogaswami, Nityananda,
Radhakrishnan and thousands of others. It encompasses both
Siddh&#x101;nta and Ved&#x101;nta. It says, God is and is in all things.
It propounds the hopeful, glorious, exultant concept that every soul
will finally merge with &#x15a;iva in undifferentiated oneness, none
left to suffer forever because of human transgression. The <span
class="cmitalic"><samp>Vedas</samp></span> wisely proclaim, “Higher
and other than the world-tree, time and forms is He from whom this
expanse proceeds—the bringer of <span
class="cmitalic"><samp>dharma,</samp></span> the remover of evil, the
lord of prosperity. Know Him as in one’s own Self, as the immortal
abode of all.” Aum Nama&#x1e25; &#x15a;iv&#x101;ya.</samp></p>

The goal is to create a tool that lets volunteers go in and grab a few
sentences as quotes, which we will then push to an online
database.

So: What is the best way to get this text rendered? Do I go the path of
setting the field's Unicode? But then what about the HTML markup? If we
create a browser object... can users select text, and does LC know that
there is a selected chunk if it is inside a browser object?

Before I start wading into this I thought I'd see if anyone else has some
good guidance in advance.


Swasti Astu, Be Well!
Brahmanathaswami

Kauai's Hindu Monastery
www.HimalayanAcademy.com



_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
