Re: Skype-conference on page-breaking?
Sounds to me like 2) is the way to go right now. This would mean minimal recreation of vertical boxes in case of changing available IPD. Sure, this is an exotic case but XSL-FO makes it possible, therefore we must be prepared for it. Thanks for the hints and the helpful example. On 08.03.2005 19:43:57 Luca Furini wrote: Jeremias Maerki wrote: Luca, do you think your total-fit approach may be written in a way to handle changing available IPDs and that look-ahead can be disabled to improve processing speed at the cost of optimal break decisions? I think that a first fit algorithm could be implemented in two different ways: 1) wait until the list of elements representing a whole page-sequence is collected, and call findBreakingPoints(); this method will call a different considerLegalBreak() method, much simpler and faster than Knuth's one. 2) start building pages little by little: the FlowLM returns elements to the PageLM as soon as one of its own children returns them. Alternative 1) is much like the total fit algorithm: breaks are computed at the end of each page-sequence; even if the evaluation method is much faster than Knuth's one, there could still be a long wait in order to get the whole list. With alternative 2) the PageLM would behave much the same as it now does: as soon as a page is filled, it is possible to call addAreas. Note that the last elements in the partial sequence cannot be considered as feasible breaks. For example, if there is a block which creates 6 lines, the sequence will be something like: box box penalty(not infinite) box penalty(not infinite) box box and the evaluation must stop at the second penalty; only when some following elements are known will it be possible to decide whether the last two lines could be at the end of a page. 
If the IPD is always the same, I think the two alternatives are equivalent, and the first one is better because it just needs a different considerLegalBreak() method; as the output file cannot be printed until the end of the process, the only advantage of 2) could be memory usage. That's the part where I have a big question mark about changing available IPD. We may have to have a check that figures out if the available IPD changes within a page-sequence by inspecting the page-masters. That would allow us to switch automatically between total-fit and best-fit or maybe even first-fit. If the IPD changes, I fear 2) must be necessarily used: if a block is split between pages with different ipd, only a few lines need to be recreated. Using 1), the LineLM should know how wide the lines are, but this cannot be known as page breaking has not yet started. The check could be done before starting the layout phase: if there is a change, 2) is used, otherwise 1). Maybe, the check could be even more sophisticated: for example, if the first page is different, but the following are equally wide, we could use 2) to create the first page and then switch to 1). A remaining question mark is with side-floats as they influence the available IPD on a line-to-line basis. This is a question mark for me too! :-) One thing for a deluxe strategy for book-style documents is certainly alignment of lines between facing pages. But that's something that's not important at the moment. I have created and implemented a new property right about this! :-) I'd be very interested to hear what you think about the difficulty of changing available IPD. The more I think about it, however, the more I think the total-fit model gets too complicated for what we/I need right now. But I'm unsure here. 
If changing IPD is really important and not just a theoretical possibility, we could start implementing 2), and later add the check and algorithm 1): the getNextKnuthElements() in the block-level LM could be used in both cases. Regards Luca Jeremias Maerki
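The check discussed above (inspect the page-masters before layout starts and pick alternative 1) or 2) accordingly) could look roughly like the following. This is only a sketch under assumptions: the class name, the enum and the idea of passing the available IPDs as a plain list of integers are hypothetical and are not taken from the FOP code base.

```java
import java.util.List;

/** Hypothetical sketch: choose a breaking strategy from the IPDs a page-sequence will offer. */
final class BreakingStrategyChooser {

    /** Alternative 1) collects the whole page-sequence first; alternative 2) builds pages one by one. */
    enum Strategy { COLLECT_WHOLE_SEQUENCE, PAGE_BY_PAGE }

    /** Returns true if any page offers a different available IPD than the first page. */
    static boolean ipdChanges(List<Integer> availableIpds) {
        for (int ipd : availableIpds) {
            if (ipd != availableIpds.get(0)) {
                return true;
            }
        }
        return false;
    }

    /** If the IPD changes, lines may have to be recreated, so pages are built one by one. */
    static Strategy choose(List<Integer> availableIpds) {
        return ipdChanges(availableIpds) ? Strategy.PAGE_BY_PAGE : Strategy.COLLECT_WHOLE_SEQUENCE;
    }
}
```

The more sophisticated variant mentioned in the thread (use 2) for a differing first page, then switch to 1)) would need the same information, only evaluated per page instead of once per page-sequence.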
Re: Skype-conference on page-breaking?
Jeremias Maerki wrote: Luca, do you think your total-fit approach may be written in a way to handle changing available IPDs and that look-ahead can be disabled to improve processing speed at the cost of optimal break decisions? I think that a first fit algorithm could be implemented in two different ways: 1) wait until the list of elements representing a whole page-sequence is collected, and call findBreakingPoints(); this method will call a different considerLegalBreak() method, much simpler and faster than Knuth's one. 2) start building pages little by little: the FlowLM returns elements to the PageLM as soon as one of its own children returns them. Alternative 1) is much like the total fit algorithm: breaks are computed at the end of each page-sequence; even if the evaluation method is much faster than Knuth's one, there could still be a long wait in order to get the whole list. With alternative 2) the PageLM would behave much the same as it now does: as soon as a page is filled, it is possible to call addAreas. Note that the last elements in the partial sequence cannot be considered as feasible breaks. For example, if there is a block which creates 6 lines, the sequence will be something like: box box penalty(not infinite) box penalty(not infinite) box box and the evaluation must stop at the second penalty; only when some following elements are known will it be possible to decide whether the last two lines could be at the end of a page. If the IPD is always the same, I think the two alternatives are equivalent, and the first one is better because it just needs a different considerLegalBreak() method; as the output file cannot be printed until the end of the process, the only advantage of 2) could be memory usage. That's the part where I have a big question mark about changing available IPD. We may have to have a check that figures out if the available IPD changes within a page-sequence by inspecting the page-masters. 
That would allow us to switch automatically between total-fit and best-fit or maybe even first-fit. If the IPD changes, I fear 2) must be necessarily used: if a block is split between pages with different ipd, only a few lines need to be recreated. Using 1), the LineLM should know how wide the lines are, but this cannot be known as page breaking has not yet started. The check could be done before starting the layout phase: if there is a change, 2) is used, otherwise 1). Maybe, the check could be even more sophisticated: for example, if the first page is different, but the following are equally wide, we could use 2) to create the first page and then switch to 1). A remaining question mark is with side-floats as they influence the available IPD on a line-to-line basis. This is a question mark for me too! :-) One thing for a deluxe strategy for book-style documents is certainly alignment of lines between facing pages. But that's something that's not important at the moment. I have created and implemented a new property right about this! :-) I'd be very interested to hear what you think about the difficulty of changing available IPD. The more I think about it, however, the more I think the total-fit model gets too complicated for what we/I need right now. But I'm unsure here. If changing ipd is really important and not just a theoretical possibility, we could start implementing 2, and later add the check and the algorithm 1: the getNextKnuthElements() in the block-level LM could be used in both cases. Regards Luca
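To make alternative 2) more concrete, here is a minimal sketch of a page-by-page first-fit loop in which trailing elements are held back until later elements show whether a break before them is needed. Everything in it is an assumption for illustration: the class and method names, the use of 1000 as the "infinite" penalty value, and the reduction of elements to boxes and penalties (no glue). It is not the FOP PageLM.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of alternative 2): pages are emitted as soon as they are known to be full. */
final class FirstFitPager {
    static final int INFINITE = 1000; // assumed value meaning "break forbidden"

    /** Simplified element: a box has a height, a penalty has height 0 and a value. */
    static final class Element {
        final boolean penalty;
        final int height;
        final int value;
        private Element(boolean penalty, int height, int value) {
            this.penalty = penalty; this.height = height; this.value = value;
        }
        static Element box(int height) { return new Element(false, height, 0); }
        static Element penalty(int value) { return new Element(true, 0, value); }
        boolean isLegalBreak() { return penalty && value < INFINITE; }
    }

    private final int pageHeight;
    private final List<Element> pending = new ArrayList<>();
    private final List<Integer> pages = new ArrayList<>(); // content height of each finished page

    FirstFitPager(int pageHeight) { this.pageHeight = pageHeight; }

    /** Called whenever a child returns an element; may emit one or more pages. */
    void add(Element e) {
        pending.add(e);
        while (emitOnePageIfFull()) {
            // keep emitting until the pending content no longer overflows a page
        }
    }

    /** Breaks at the last legal break that still fit, but only once overflow is certain. */
    private boolean emitOnePageIfFull() {
        int height = 0, lastFit = -1, lastFitHeight = 0;
        for (int i = 0; i < pending.size(); i++) {
            Element e = pending.get(i);
            if (e.isLegalBreak() && height <= pageHeight) {
                lastFit = i;
                lastFitHeight = height;
            }
            height += e.height;
            if (height > pageHeight && lastFit >= 0) {
                pages.add(lastFitHeight);
                pending.subList(0, lastFit + 1).clear();
                return true;
            }
        }
        return false; // nothing overflows yet (or there is no usable break): wait for more elements
    }

    /** End of the page-sequence: whatever is still pending becomes the last page. */
    void finish() {
        int height = 0;
        for (Element e : pending) { height += e.height; }
        if (height > 0) { pages.add(height); }
        pending.clear();
    }

    List<Integer> finishedPages() { return new ArrayList<>(pages); }
    int pendingCount() { return pending.size(); }
}
```

Feeding it the sequence from the example (box box penalty box penalty box box, each box 10 units high, page height 35) emits no page while only the first five elements are known; the first page is closed at the second penalty once the next box arrives, and the last two boxes stay pending until finish() is called. A content run that overflows without any legal break is simply left overfull in this sketch.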
Re: Skype-conference on page-breaking?
Thanks, Luca. I've had a nice casual talk on the phone with Simon yesterday. Essentially, we only talked about very high-level stuff, especially the decision for a certain strategy (or two). You know I came up with the idea to create a simpler best-fit strategy with no look-ahead for invoice-style documents but maybe it would be possible to design your obvious total-fit strategy in a way that it could be used as a best-fit without look-ahead. The problem, like I mentioned already, is the possible change of available IPD within a page-sequence which results in a possible back-tracking and recalculation of vertical boxes. Of course, if it's possible to stay with one page-breaking algorithm for all use cases that would be best (because of the reduced effort), but only if the algorithm is reasonably fast for invoice-style documents. I'm repeatedly confronted with certain speed requirements in this case. Since modern high-volume single-feed printers handle about 180 pages per minute (continuous feed systems handle over 4 times that speed, but I think that's neither relevant, nor realistic here) FOP should be able to operate close to these 180 pages per minute for not too complex documents on a modern server. That means about 330ms per page. Not much. Of course, in such an environment it is possible to distribute the formatting process over several blade servers but I had to realize that certain companies tend to prefer spending 100'000 dollars on a big server to spending a lot less for a much faster CPU-power-oriented setup. It seems to be hard to say good-bye to the old host systems. Well, that's just what reality looks like in my environment. Simon, for example, is much more interested in book-style documents where there are other requirements. Speed is not a big issue, but quality is. In the end, I think we need to rate the chosen approach from these two points of view. 
These are very contradictory requirements and it's something that seems quite important to me not to forget here. Luca, do you think your total-fit approach may be written in a way to handle changing available IPDs and that look-ahead can be disabled to improve processing speed at the cost of optimal break decisions? If it's ok for you (and feasible) I'd like to integrate what you already have (in code) into that branch I was talking about. I would like to avoid recreating something you've already started, even if it doesn't work with the changes that happened in the last weeks. Even if we may create two different strategies I'm sure that certain parts will be shared by both approaches, like the creation of Knuth-style elements for the PageLM. Some more comments inline: On 04.03.2005 13:23:01 Luca Furini wrote: Jeremias Maerki wrote: Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point. Ok, I'll try to. The main change in the LineLM is that the line breaking algorithm does not select only the node in activeList with fewest demerits: all the nodes whose demerits are <= a threshold are used to create LineBreakPositions, so for each paragraph there is a set of layout options (for example, a paragraph could create 8 to 10 lines, 9 being the layout with fewest demerits). Hmm, that's a feature that I would say is something that only book-style documents will need. Invoice-style documents could live without it. 
According to the value of widows and orphans, the LineLM creates a sequence of elements: besides normal lines, represented by a box, there are optional lines, represented by box(0) penalty(inf,0) glue(0,1,0) box(0) and removable lines box(0) penalty(inf,0) glue(1,0,1) box(0) A few complications arise if not every possible layout allows breaks between lines, but they all can be solved using boxes, glues and penalties (for example, if a paragraph needs 3 or 4 lines, if it uses 3 it cannot be parted). Also something that's not all too important for invoice-style documents, although it can't hurt to have it. The BlockLM, and a block stacking LM in general, adds elements representing its children's spaces and keep condition, for example adding a 0 penalty or an infinite penalty according to child1.mustKeepWithNext(), child2.mustKeepWithPrevious() and this.mustKeepTogether(). That's certainly a must-have in any case. The PageLM, once it has the list of elements representing a whole page-sequence (or the content before a forced page break), calls the same breaking algorithm, using only a different selection method which leaves only one node in activeList. That's the part where I have a big question mark about changing available IPD. We may have to have a check that figures out if the available IPD changes within a page-sequence by inspecting the page-masters. That would allow us to switch automatically between total-fit and best-fit or maybe even first-fit. A remaining question mark is with side-floats as they influence
Re: Skype-conference on page-breaking?
I don't know why this is important to you but it's two to three months. On 04.03.2005 12:40:04 Peter B. West wrote: Jeremias Maerki wrote: Sounds very interesting. Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point. My problem is that I have to deliver working page breaking with keeps, breaks, multi-column, adjustable spacing etc. in a relatively short period of time. How short? Peter -- Peter B. West http://cv.pbw.id.au/ Project Folio http://defoe.sourceforge.net/folio/ Jeremias Maerki
Re: Skype-conference on page-breaking?
Jeremias Maerki wrote: I don't know why this is important to you Just curious. but it's two to three months. Ouch. Good luck. You might want to keep an eye on Folio. Peter On 04.03.2005 12:40:04 Peter B. West wrote: Jeremias Maerki wrote: Sounds very interesting. Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point. My problem is that I have to deliver working page breaking with keeps, breaks, multi-column, adjustable spacing etc. in a relatively short period of time. How short? -- Peter B. West http://cv.pbw.id.au/ Project Folio http://defoe.sourceforge.net/folio/
Re: Skype-conference on page-breaking?
Ok then, I'll call you Sunday evening 19.00 CET if nothing goes wrong. The others interested will find me in Skype. FYI, I'll be out of touch from later today until Sunday afternoon. On 03.03.2005 21:46:55 Simon Pepping wrote: On Thu, Mar 03, 2005 at 08:34:54PM +0100, Jeremias Maerki wrote: I've bought some SkypeOut credits now. Funny thing: It's cheaper to call Simon in the Netherlands than to call someone in Lucerne via PSTN. Anyway, I'd like to ask if we could hold to a brainstorming conference call on page breaking either Sunday evening or next Monday or Tuesday somewhere between 8:00 and 24:00 CET. Of course, on my wish list there are Simon, Finn and Luca. I'm happy to call either of you on your normal phone via SkypeOut if you don't have broadband. I hope I can get at least one of you three on the line. Others are invited to listen in and contribute, of course. Max. number in the conference is four people with Skype. Sunday evening is OK. Monday and Tuesday after working hours is OK. I could be available from 16.00 hrs, but I would prefer after 19.00 hrs CET. There is no way I can do this at the office. Regards, Simon -- Simon Pepping home page: http://www.leverkruid.nl Jeremias Maerki
Re: Skype-conference on page-breaking?
Jeremias Maerki wrote: Anyway, I'd like to ask if we could hold to a brainstorming conference call on page breaking either Sunday evening or next Monday or Tuesday somewhere between 8:00 and 24:00 CET. Of course, on my wish list there are Simon, Finn and Luca. I'm happy to call either of you on your normal phone via SkypeOut if you don't have broadband. I hope I can get at least one of you three on the line. I'm very interested in page breaking, and I would be happy to contribute. Unfortunately, I'm not much used to speaking English :-(, so I think I would be much more comfortable with the idea of communicating via written words! As I have said before (or maybe I forgot to ...) I have done a few experiments trying to use Knuth's algorithm in page breaking, and I have a working implementation which handles only some block level formatting objects (blocks and lists) and simplified documents (no footnotes or floats, at the moment, and pages with equal length and width), but it has some (I hope) interesting features: for example, it is able to adjust the number of lines used for each paragraph in order to both fill the pages and avoid orphans and widows. In a few words, using the box - penalty - glue model it is possible to represent paragraphs with an adjustable number of lines. I started working on it a few months ago, and I could not keep it updated with all the changes, but if you are interested I could try and recreate these features using the most recent code. Anyway, this could be done after we have reached a basic implementation. Regards Luca
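As an illustration of how the box - penalty - glue model can express a paragraph with an adjustable number of lines, the sketch below counts heights in whole lines. Elsewhere in this thread an optional line is described with a glue(0,1,0) and a removable line with a glue(1,0,1), read here as (width, stretch, shrink); the sketch keeps only those glues and the normal-line boxes and leaves out the zero-width boxes and infinite penalties that surround them. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a paragraph whose line count can vary, measured in lines. */
final class AdjustableParagraph {

    /** Simplified element with a natural width plus stretch and shrink; a box has neither. */
    static final class Element {
        final int width, stretch, shrink;
        private Element(int width, int stretch, int shrink) {
            this.width = width; this.stretch = stretch; this.shrink = shrink;
        }
        static Element box(int width) { return new Element(width, 0, 0); }
        static Element glue(int width, int stretch, int shrink) {
            return new Element(width, stretch, shrink);
        }
    }

    /** Normal lines are boxes; optional lines are glue(0,1,0); removable lines are glue(1,0,1). */
    static List<Element> paragraph(int normalLines, int optionalLines, int removableLines) {
        List<Element> elements = new ArrayList<>();
        for (int i = 0; i < normalLines; i++) { elements.add(Element.box(1)); }
        for (int i = 0; i < optionalLines; i++) { elements.add(Element.glue(0, 1, 0)); }
        for (int i = 0; i < removableLines; i++) { elements.add(Element.glue(1, 0, 1)); }
        return elements;
    }

    /** Line count of the preferred layout. */
    static int naturalLines(List<Element> elements) {
        int sum = 0;
        for (Element e : elements) { sum += e.width; }
        return sum;
    }

    /** Fewest lines the paragraph can be set in (all shrink used). */
    static int minLines(List<Element> elements) {
        int sum = 0;
        for (Element e : elements) { sum += e.width - e.shrink; }
        return sum;
    }

    /** Most lines the paragraph can be set in (all stretch used). */
    static int maxLines(List<Element> elements) {
        int sum = 0;
        for (Element e : elements) { sum += e.width + e.stretch; }
        return sum;
    }
}
```

With paragraph(8, 1, 1) the preferred layout has 9 lines and the page breaker may use anything from 8 to 10, which is the kind of freedom that lets it fill pages while avoiding orphans and widows.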
Re: Skype-conference on page-breaking?
Sounds very interesting. Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point. My problem is that I have to deliver working page breaking with keeps, breaks, multi-column, adjustable spacing etc. in a relatively short period of time. On 04.03.2005 11:09:42 Luca Furini wrote: Jeremias Maerki wrote: Anyway, I'd like to ask if we could hold to a brainstorming conference call on page breaking either Sunday evening or next Monday or Tuesday somewhere between 8:00 and 24:00 CET. Of course, on my wish list there are Simon, Finn and Luca. I'm happy to call either of you on your normal phone via SkypeOut if you don't have broadband. I hope I can get at least one of you three on the line. I'm very interested in page breaking, and I would be happy to contribute. Unfortunately, I'm not much used to speaking English :-(, so I think I would be much more comfortable with the idea of communicating via written words! As I have said before (or maybe I forgot to ...) I have done a few experiments trying to use Knuth's algorithm in page breaking, and I have a working implementation which handles only some block level formatting objects (blocks and lists) and simplified documents (no footnotes or floats, at the moment, and pages with equal length and width), but it has some (I hope) interesting features: for example, it is able to adjust the number of lines used for each paragraph in order to both fill the pages and avoid orphans and widows. In a few words, using the box - penalty - glue model it is possible to represent paragraphs with an adjustable number of lines. I started working on it a few months ago, and I could not keep it updated with all the changes, but if you are interested I could try and recreate these features using the most recent code. Anyway, this could be done after we have reached a basic implementation. Regards Luca Jeremias Maerki
Re: Skype-conference on page-breaking?
Jeremias Maerki wrote: Sounds very interesting. Would you consider sharing what you already have? This may help us in the general discussion and may be a good starting point. My problem is that I have to deliver working page breaking with keeps, breaks, multi-column, adjustable spacing etc. in a relatively short period of time. How short? Peter -- Peter B. West http://cv.pbw.id.au/ Project Folio http://defoe.sourceforge.net/folio/
Re: Skype-conference on page-breaking?
I've bought some SkypeOut credits now. Funny thing: It's cheaper to call Simon in the Netherlands than to call someone in Lucerne via PSTN. Anyway, I'd like to ask if we could hold to a brainstorming conference call on page breaking either Sunday evening or next Monday or Tuesday somewhere between 8:00 and 24:00 CET. Of course, on my wish list there are Simon, Finn and Luca. I'm happy to call either of you on your normal phone via SkypeOut if you don't have broadband. I hope I can get at least one of you three on the line. Others are invited to listen in and contribute, of course. Max. number in the conference is four people with Skype. On 01.03.2005 23:31:16 Jeremias Maerki wrote: Maybe I could hook you into a Skype conference by using SkypeOut. It's pretty cheap to call to the Netherlands. According to the FAQ this is possible. On 01.03.2005 22:26:50 Simon Pepping wrote: On Tue, Mar 01, 2005 at 03:09:46PM +0100, Jeremias Maerki wrote: To speed things up could we hold a conference (using Skype, for example) to discuss further details on page-breaking? I'd volunteer to sum up any results during that discussion for the archives. I have Finn on my Skype radar already. I do not have a broadband connection, and therefore no Skype or other VoIP. Jeremias Maerki Jeremias Maerki
Re: Skype-conference on page-breaking?
On Thu, Mar 03, 2005 at 08:34:54PM +0100, Jeremias Maerki wrote: I've bought some SkypeOut credits now. Funny thing: It's cheaper to call Simon in the Netherlands than to call someone in Lucerne via PSTN. Anyway, I'd like to ask if we could hold to a brainstorming conference call on page breaking either Sunday evening or next Monday or Tuesday somewhere between 8:00 and 24:00 CET. Of course, on my wish list there are Simon, Finn and Luca. I'm happy to call either of you on your normal phone via SkypeOut if you don't have broadband. I hope I can get at least one of you three on the line. Others are invited to listen in and contribute, of course. Max. number in the conference is four people with Skype. Sunday evening is OK. Monday and Tuesday after working hours is OK. I could be available from 16.00 hrs, but I would prefer after 19.00 hrs CET. There is no way I can do this at the office. Regards, Simon -- Simon Pepping home page: http://www.leverkruid.nl
Skype-conference on page-breaking?
To speed things up could we hold a conference (using Skype, for example) to discuss further details on page-breaking? I'd volunteer to sum up any results during that discussion for the archives. I have Finn on my Skype radar already. Jeremias Maerki
Re: Skype-conference on page-breaking?
I would be pleased to listen. Renaud
Re: Skype-conference on page-breaking?
On Tue, Mar 01, 2005 at 03:09:46PM +0100, Jeremias Maerki wrote: To speed things up could we hold a conference (using Skype, for example) to discuss further details on page-breaking? I'd volunteer to sum up any results during that discussion for the archives. I have Finn on my Skype radar already. I do not have a broadband connection, and therefore no Skype or other VoIP. Regards, Simon -- Simon Pepping home page: http://www.leverkruid.nl
Re: Skype-conference on page-breaking?
I'd be happy to 'participate' although I don't have a skype acct yet. I don't know what I can offer, but I'm here to help! Cheers! On Mar 1, 2005, at 2:31 PM, Jeremias Maerki wrote: Maybe I could hook you into a Skype conference by using SkypeOut. It's pretty cheap to call to the Netherlands. According to the FAQ this is possible. On 01.03.2005 22:26:50 Simon Pepping wrote: On Tue, Mar 01, 2005 at 03:09:46PM +0100, Jeremias Maerki wrote: To speed things up could we hold a conference (using Skype, for example) to discuss further details on page-breaking? I'd volunteer to sum up any results during that discussion for the archives. I have Finn on my Skype radar already. I do not have a broadband connection, and therefore no Skype or other VoIP. Jeremias Maerki Web Maestro Clay -- [EMAIL PROTECTED] - http://homepage.mac.com/webmaestro/ My religion is simple. My religion is kindness. - HH The 14th Dalai Lama of Tibet