Re: Small, but old bug

2015-04-09 Thread Andrew Pinder
In message <20150409203946.gb19...@kyllikki.org>
 on 9 Apr 2015 Vincent Sanders wrote:

> Fixing this is earmarked for our 4.0 series and needs a re-written
> render engine. The new engine is a job comparable in size to the
> entire project to date and has not yet been started.

You guys are brave - or maybe foolhardy!

Thanks for your efforts so far :-)


Regards

Andrew
-- 
Andrew Pinder



Re: Small, but old bug

2015-04-09 Thread Vincent Sanders
On Thu, Apr 09, 2015 at 05:20:52PM +0200, David Feugey wrote:
> When I made a page with accents, all is OK with Unicode.
> For example "élément"
> 
> But if I use HTML codes (&eacute;l&eacute;ments), NetSurf considers that
> there are 3 words "é"+"lé"+"ments". A cut after each special character.
> 
> And so carriage return is sometimes applied at the wrong place.
> Will this bug be corrected?
> 
> Bye, David

This is in the tracker already as bugs #467 [1], #408 [2] and #476 [3].

It is caused by our text reflow and word breaking algorithm, which does
not meet the standard and breaks words where it really ought not to.
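
As a rough illustration only (this is not NetSurf code, and the function
names are hypothetical), the difference between ending a word at every
entity and decoding entities before looking for word boundaries can be
sketched in Python. It assumes the reported page used the `&eacute;`
entity:

```python
import html
import re

# Matches a single character reference such as &eacute; or &#233;
ENTITY = re.compile(r"(&[A-Za-z0-9#]+;)")
# Whitespace that may act as a line break opportunity (excludes U+00A0)
BREAKABLE = re.compile(r"[ \t\r\n\f]+")


def words_breaking_at_entities(source):
    """Mimic the reported behaviour: every entity ends the current word."""
    words = []
    for run in BREAKABLE.split(source):
        current = ""
        # The capturing group makes re.split keep the entities themselves.
        for part in ENTITY.split(run):
            if not part:
                continue
            current += html.unescape(part)
            if ENTITY.fullmatch(part):
                words.append(current)
                current = ""
        if current:
            words.append(current)
    return words


def words_decoding_first(source):
    """Decode entities first, then split only at real whitespace."""
    decoded = html.unescape(source)
    return [w for w in BREAKABLE.split(decoded) if w]


print(words_breaking_at_entities("&eacute;l&eacute;ments"))
print(words_decoding_first("&eacute;l&eacute;ments"))
```

The first function reproduces the three fragments from the report; the
second treats the same input as one word, which is what the reporter
expected.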

For an explanation of the complexity involved in reflowing text
efficiently on constrained systems, Computerphile did a couple of
excellent videos [4][5]. While these specifically talk about e-readers,
a browser faces similar challenges (and more).

Fixing this is earmarked for our 4.0 series and needs a re-written
render engine. The new engine is a job comparable in size to the
entire project to date and has not yet been started. 

[1] http://bugs.netsurf-browser.org/mantis/view.php?id=467
[2] http://bugs.netsurf-browser.org/mantis/view.php?id=408
[3] http://bugs.netsurf-browser.org/mantis/view.php?id=476
[4] https://www.youtube.com/watch?v=kzdugwr4Fgk
[5] https://www.youtube.com/watch?v=CdbvgRqyC-0

-- 
Regards Vincent
http://www.kyllikki.org/



Re: Small, but old bug

2015-04-09 Thread Peter Young
On 9 Apr 2015 David Feugey wrote:

> When I made a page with accents, all is OK with Unicode.
> For example "élément"

> But if I use HTML codes (&eacute;l&eacute;ments), NetSurf considers that
> there are 3 words "é"+"lé"+"ments". A cut after each special character.

> And so carriage return is sometimes applied at the wrong place.
> Will this bug be corrected?

Certainly not, if nobody reports it on the bug tracker.

Best wishes,

Peter.

-- 
Peter Young (zfc Re) and family
Prestbury, Cheltenham, Glos. GL52, England
http://pnyoung.orpheusweb.co.uk
pnyo...@ormail.co.uk



Re: Small, but old bug

2015-04-09 Thread cj
David Feugey wrote:
> And so carriage return is sometimes applied at the wrong place.
> Will this bug be corrected?

I think this 'bug' has been present for a long time: NetSurf puts in
a line break at a tag or entity, splitting words etc. You read the
ROOL forum. Have you not noticed that words like you're or won't or
I'll get split if the HTML code for the right single quote is being
used? These abbreviations are used a lot in the forum, so the
splitting at the end of a line is quite common. It also happens with
tags if they are used in the middle of a word, e.g. if a sequence such
as isopropyl is being used. I am working on converting some
issues of Archive magazine to HTML at the moment and the sequence
RISC OS is used a lot, so it gets split regularly.
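
A minimal sketch of the expected behaviour, in Python rather than
anything taken from NetSurf: break opportunities come only from ordinary
whitespace in the text content, never from a tag or entity boundary. The
archive has lost the original markup, so the `<i>` tag and the `&nbsp;`
entity in the usage lines are assumptions, and `unbreakable_units` is a
hypothetical name.

```python
import re
from html.parser import HTMLParser

# Whitespace that allows a line break; U+00A0 (no-break space) is
# deliberately left out so that it keeps its neighbours together.
BREAKABLE = re.compile(r"[ \t\r\n\f]+")


class TextCollector(HTMLParser):
    """Collect text content, ignoring tags and decoding entities."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def unbreakable_units(markup):
    """Return the runs of text that should never be split across lines."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    # Joining with no separator means a tag boundary is not a word boundary.
    text = "".join(collector.chunks)
    return [unit for unit in BREAKABLE.split(text) if unit]


print(unbreakable_units("<i>iso</i>propyl alcohol"))
print(unbreakable_units("RISC&nbsp;OS is used a lot"))
print(unbreakable_units("you&rsquo;re"))
```

Each returned unit is something a line breaker would have to keep on one
line, so "isopropyl", "RISC OS" joined by a no-break space, and "you're"
with a right single quote each stay whole.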

-- 
Chris Johnson



Small, but old bug

2015-04-09 Thread David Feugey
When I made a page with accents, all is OK with Unicode.
For example "élément"

But if I use HTML codes (&eacute;l&eacute;ments), NetSurf considers that
there are 3 words "é"+"lé"+"ments". A cut after each special character.

And so carriage return is sometimes applied at the wrong place.
Will this bug be corrected?

Bye, David