On Sat, Mar 24, 2012 at 8:29 PM, superrubiroyd <[email protected]> wrote:
> Hi, Yegor
> I am writing in reply to our previous discussion about participation in
> GSoC 2012.
> I have read a lot of documentation on the file structures and have become much
> more familiar with the POI code.
> I have installed the Apache POI project on my system from source and I
> have worked on the bugs which you recommended to me.
> bug 52272. I reproduced this problem and understood its cause. When the
> sheet is cloned, all of its drawing records are assembled into an EscherAggregate
> record, and when you call clonedSheet.getDrawingPatriarch() it checks whether the
> sheet contains any drawing records. If they are already assembled into an
> EscherAggregate record, it returns null. So, to fix this bug it is enough to remove
> the check for drawing records in HSSFSheet.getDrawingEscherAggregate(). A
> patch that fixes this bug was already attached to
> https://issues.apache.org/bugzilla/show_bug.cgi?id=52272 when I visited the
> page, and I don't know why it wasn't applied.
>

The main reason why Bugzilla 52272 is still open is that the patch is missing unit tests. It touches rather sensitive bits of HSSF and I'd rather not apply it in its current form.

On a general note, if your proposal is approved and we work on this project, then your code MUST be backed by unit tests. I know that drawing stuff is hard to test - often you need to open Excel to verify that the output is correct, but you will need to cover as much as possible with JUnit tests.

>
> I have thought about the idea you proposed:
>>Improve support for drawings in HSSF. The main problem is that you
>>can create new drawings from scratch but cannot modify existing ones.
>>This task will require deep understanding of the Excel Biff8 and MS
>>Binary Drawing(Escher) formats and current architecture of the HSSF
>>module.
>
> As I understand it, to fix this problem I have to add to each class that extends
> HSSFShape some methods to modify an existing object, and add some API to
> HSSFPatriarch for searching for shapes by some criteria. Am I right?
>

The main drawback of the current architecture is that HSSF shapes are detached from their low-level drawing records. Every time you save a sheet, its drawing container is re-created, and this is not good.

Firstly, it means that POI does not preserve existing stuff. A drawing can contain things that POI does not support, but those must be preserved. In fact, when POI recreates the drawing it saves only the shapes supported in the API.

Secondly, HSSFShape and its subclasses are "pure models", that is, all shape properties are stored in class fields as primitive variables. This means that a subclass of HSSFShape can store only the limited set of properties that are explicitly defined in the class. This is not good either, because a shape can hold tens or even hundreds of properties while HSSF makes sense of only a few.

Very roughly, my ideas are below:

 - HSSFShape should keep its EscherContainerRecord in a class field.
   All getters/setters of shape properties should propagate the changes to the Escher layer.
 - when reading shapes from a patriarch you need to search for shape containers and pass them to new instances of HSSFShape.
 - there needs to be a concept of a "default" shape: shapes not supported by POI will be initialized with this new class.
 - when a new shape is created, a new EscherContainerRecord is created, attached to the drawing and passed to the created HSSFShape.
 - "save" logic should be simplified: actually it should only save the root container that holds individual shapes.

and the big one:
 - tests, more tests and even more tests :)

Cheers,
Yegor

> Thank you very much.
> With best regards, Evgeniy Berlog.
>
> I am sorry about my poor English((
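
For illustration only, the design outlined in the list above (a shape that keeps a reference to its underlying record container, writes property changes straight through to it, and a "save" that only hands back the root container) could be sketched roughly as follows. Every class here (RecordContainer, Shape, DefaultShape, Patriarch) is a hypothetical stand-in invented for this sketch, as are the property ids; none of it is POI's actual API or the real EscherContainerRecord.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a low-level drawing container: a bag of numeric
// properties keyed by id. This is NOT POI's EscherContainerRecord.
class RecordContainer {
    private final Map<Integer, Integer> properties = new LinkedHashMap<>();

    Integer getProperty(int id) { return properties.get(id); }
    void setProperty(int id, int value) { properties.put(id, value); }
    int propertyCount() { return properties.size(); }
}

// A shape keeps no property state of its own; it only references its record.
class Shape {
    static final int PROP_LINE_COLOR = 1; // illustrative id, an assumption

    private final RecordContainer record;

    Shape(RecordContainer record) { this.record = record; }

    RecordContainer getRecord() { return record; }

    // Getter reads from the record; 0 is an arbitrary default for this sketch.
    int getLineColor() {
        Integer value = record.getProperty(PROP_LINE_COLOR);
        return value == null ? 0 : value;
    }

    // Setter propagates the change directly to the record layer.
    void setLineColor(int rgb) { record.setProperty(PROP_LINE_COLOR, rgb); }
}

// "Default" shape: wraps a container the API does not otherwise understand,
// so its unknown properties survive a read/save round trip.
class DefaultShape extends Shape {
    DefaultShape(RecordContainer record) { super(record); }
}

class Patriarch {
    private final List<RecordContainer> root = new ArrayList<>();
    private final List<Shape> shapes = new ArrayList<>();

    // Reading: wrap each existing container instead of copying its fields.
    Patriarch(List<RecordContainer> existing) {
        for (RecordContainer container : existing) {
            root.add(container);
            shapes.add(new DefaultShape(container));
        }
    }

    // Creating: make a new container, attach it to the drawing, hand it to the shape.
    Shape createShape() {
        RecordContainer container = new RecordContainer();
        root.add(container);
        Shape shape = new Shape(container);
        shapes.add(shape);
        return shape;
    }

    List<Shape> getShapes() { return shapes; }

    // "Save" only returns the root list; nothing is rebuilt from shape fields.
    List<RecordContainer> save() { return root; }
}

class ShapeSketchDemo {
    public static void main(String[] args) {
        RecordContainer existing = new RecordContainer();
        existing.setProperty(999, 42); // a property the sketch API knows nothing about

        Patriarch patriarch = new Patriarch(Arrays.asList(existing));
        patriarch.getShapes().get(0).setLineColor(0xFF0000);
        patriarch.createShape().setLineColor(0x00FF00);

        for (RecordContainer container : patriarch.save()) {
            System.out.println(container.propertyCount() + " properties");
        }
    }
}
```

The point of the sketch is that the unknown property (id 999) is still in the saved container after the shape was modified, because the container itself was never re-created.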
