On Tue, Oct 26, 2010 at 10:03:01PM +0200, Roland Smith wrote:
> On Tue, Oct 26, 2010 at 12:30:20PM -0700, Gary Kline wrote:
> > On Tue, Oct 26, 2010 at 08:59:24PM +0200, Polytropon wrote:
> > > On Tue, 26 Oct 2010 11:38:20 -0700, Liontaur <[email protected]> wrote:
> > > > Related but slightly OT, I've never had much luck getting it the other way
> > > > around, HTML to PDF. It's often off a bit. I can't remember off the top of
> > > > my head what ports I've tried, but yeah. Either the images are wonky or my
> > > > forms go wonky.
> > > 
> > > This is simply because HTML is not typesetting-capable. Depending
> > > on the source of the PDF file, it may help to convert from THAT
> > > format instead of from PDF. E. g. if you have a .tex (LaTeX) file
> > > that has been the source of the PDF file, you can use a converter
> > > from LaTeX to HTML, often with acceptable results.
> > > 
> > > The HTML concept, especially when incorporating CSS for formatting,
> > > _can_ be used to gain a bit of typographic quality, e. g. by defining
> > > parameters for "screen" and for "printed" media. Still, it struggles
> > > with things like maintaining good grey values, hyphenation and
> > > ligatures.
> 
> You can add proper justification to the list that HTML doesn't do well!
> 
> >     Hmm. The ligatures that looked so great in my .tex/PDF output
> >     got lost.
> 
> Very few programs do ligatures well. If you're using Unicode text, you can use
> them directly in your text, like this: ??? ??? ??? ??? ??? ???
> 
> How good these look depends on the fonts used. I've got a whole list of handy
> Unicode characters on my webpage. See the entry marked 2010-10-16.
> 
> >       Only that somehow, HTML4 can read the hex code that
> >     abiword's html created.  :-)   Also, the `` and '' look great in
> >     Times.  I fixed the page numbers--all had to go away; I edited
> >     the chapter headings--all by hand.  What's left are the hundreds
> >     of broken paragraphs.
> 
> You might fare better by taking the TeX source, running it through detex(1) and
> using markdown [http://daringfireball.net/projects/markdown/] to create HTML.
> 
> >     What utility takes a LaTeX file -> HTML?  (Be nice to have both
> >     *strictly professional typeset* and then HTML.  I can add
> >     indents for AE style paragraphing, and much more.  Fix the
> >     hyphenation, etc.)
> 
> Next to the obvious textproc/latex2html? :-)
> 



        Yeah, found it with locate!  I haven't checked my .tex source
        yet, but latex2html produces some **very** interesting
        results.

        In my latest j.html file there are hundreds of "broken
        paragraphs" such as:

                She stopped and
                turned around.

                "What?" he said.

                "I just thought I'= taking the 
                wrong course."  

        And so on.  There is a "<br>" embedded in hundreds of
        paragraphs.  Now that I have the output from latex2html to
        check against, things should be that much easier.  Do you, or
        does any regex wiz, have a way of catching embedded <br>'s
        within sentences?

        It might save me.  It would certainly make things _easier_!
        I'll play around with /[a-zA-Z]<br>[A-Za-z]/.  Hope the
        &gt; and &lt; aren't a problem in regexland....  :-)
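
One possible way to catch the embedded <br>'s is sketched below in Python.
It is only a rough sketch under a few assumptions: a <br> counts as
"embedded" when the character before it is a letter, comma or semicolon
(so the sentence has not ended) and the next visible character is a letter.
The names EMBEDDED_BR, find_embedded_breaks and join_embedded_breaks are
made up for this illustration. In Python's re module the literal < and >
need no escaping, and the pattern also allows for whitespace or a newline
around the tag, which a plain /[a-zA-Z]<br>[A-Za-z]/ would miss.

```python
import re

# Assumption: a <br> is "embedded" when the preceding character is a letter,
# comma or semicolon and the next non-whitespace character is a letter.
EMBEDDED_BR = re.compile(
    r"(?<=[A-Za-z,;])"    # previous character does not end a sentence
    r"\s*<br\s*/?>\s*"    # <br>, <br/> or <br />, plus surrounding whitespace
    r"(?=[A-Za-z])",      # the text continues with a letter
    re.IGNORECASE,
)


def find_embedded_breaks(html):
    """Return the 1-based line number of every embedded <br> in html."""
    return [html.count("\n", 0, m.start()) + 1
            for m in EMBEDDED_BR.finditer(html)]


def join_embedded_breaks(html):
    """Replace every embedded <br> (and its whitespace) with one space."""
    return EMBEDDED_BR.sub(" ", html)


if __name__ == "__main__":
    sample = 'She stopped and<br>\nturned around.<br>\n"What?" he said.<br>\n'
    print(find_embedded_breaks(sample))
    print(join_embedded_breaks(sample))
```

Running find_embedded_breaks first and eyeballing the reported lines is
probably safer than rewriting the file blindly, since a <br> after a line
that ends without punctuation (a heading, say) would also match.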


        thanks much,

        gary


> Roland
> -- 
> R.F.Smith                                   http://www.xs4all.nl/~rsmith/
> [plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
> pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)



-- 
 Gary Kline  [email protected]  http://www.thought.org  Public Service Unix
    The 7.90a release of Jottings: http://jottings.thought.org/index.php
                           http://journey.thought.org
                                        