I've submitted a patch for
https://issues.apache.org/bugzilla/show_bug.cgi?id=56854

-----Original Message-----
From: Yaniv Kunda [mailto:[email protected]]
Sent: Thursday, August 14, 2014 17:08
To: POI Developers List
Subject: RE: Performance issues related to XMLBeans

I started working on patches for this issue, which I opened as bug #56854:
https://issues.apache.org/bugzilla/show_bug.cgi?id=56854


-----Original Message-----
From: Jason Harrop [mailto:[email protected]]
Sent: Thursday, August 7, 2014 02:51
To: POI Developers List
Subject: Re: Performance issues related to XMLBeans

docx4j - which my company sponsors - uses JAXB and has done since its
inception.   The docx4j jar, including classes for the schemas, is
around 4 MB.

I've attached below a list of package names, to give you an idea of which
schemas are included (namely schemas from the standard, plus
some additional Microsoft ones).   The 4MB includes complete generated
classes; we don't need to offer a partial distribution.

Around the time POI was first adopting XMLBeans, there was some
face-to-face discussion in San Fran about merging docx4j and POI, but we
didn't pursue the discussions at the time, as the XMLBeans train was
already leaving the station...

cheers .. Jason

-----------------------

org.docx4j.wml
org.pptx4j.pml
org.xlsx4j.sml

org.docx4j.bibliography
org.docx4j.dml
org.docx4j.dml.chart
org.docx4j.dml.chartDrawing
org.docx4j.dml.compatibility
org.docx4j.dml.diagram
org.docx4j.dml.diagram2008
org.docx4j.dml.lockedCanvas
org.docx4j.dml.picture
org.docx4j.dml.spreadsheetdrawing
org.docx4j.dml.wordprocessingDrawing

org.docx4j.docProps.coverPageProps

org.docx4j.math

org.docx4j.mce

org.docx4j.sharedtypes
org.docx4j.vml
org.docx4j.vml.officedrawing
org.docx4j.vml.presentationDrawing
org.docx4j.vml.root
org.docx4j.vml.spreadsheetDrawing
org.docx4j.vml.wordprocessingDrawing

org.docx4j.schemas.microsoft.com.office.word_2006.wordml
org.docx4j.w14
org.docx4j.w15
org.xlsx4j.schemas.microsoft.com.office.excel_2006.main
org.xlsx4j.schemas.microsoft.com.office.excel_2008_2.main

On Thu, Aug 7, 2014 at 9:14 AM, Andreas Beeker <[email protected]>
wrote:
> Hi Yaniv,
>
>
>> The fact that you described the XMLBeans proxy objects as "leaking"
>> through the API says it all - they should have been encapsulated from
>> the start.
>
> The problem with the OOXML schema(s) is that they are huge - and
> there are now at least 4 (ECMA) versions available. As with the
> binary format, one of the benefits of POI is that you can modify
> documents without destroying unsupported elements.
>
> If we hypothetically switch to a different marshaller, it should
> preserve the infoset - I've never really tried JAXB (my colleague
> working with complex bipro schemas hates it ...), but that should be
> possible [1]
>
> So the next decision is which mechanism to pick, so that we don't
> wrap/duplicate everything the OOXML schema provides. So far, I can
> only think of two options:
> either we provide a strongly typed (e.g. POJO-based) interface -
> similar to the XMLBeans solution - or some kind of flexible DOM,
> which is more prone to user errors.
> Btw. maybe it was just an act of necessity, but I like the solution
> of not providing the whole schema as classes (i.e. the reduced jar) -
> not sure if this is also possible with JAXB.
>
>
>> The best way to do that in light of its existing exposure is to
>> deprecate it and provide a new alternative wrapper that will
>> eventually supersede it.
>
> Next part: would it be better to duplicate (all) the classes
> for the new library, or to switch inside the current classes depending
> on the config/input? The second approach would probably lead to an
> implementation which encapsulates the library specifics a bit more, so
> in case we decide to switch again, it might not be so hard
> anymore ... (yeah, I know, totally hypothetical ... )
>
>
>> - the XMLBeans project is dead (in the attic for over a year, reboot
>> is stuck)
>
> There was a time some years ago when I thought the same thing about
> POI ...
>
>
>> The conclusion that it should be replaced at some point
>
> Yes, I agree with you totally ... maybe ;)
>
>
>> but for the shorter term be hidden and only used as an internal
>> implementation.
>
> Same as the top comment - I don't think we should wrap everything with
> POI classes, but let the user interact with the "model", whatever
> that would look like.
> As a POI user I never really understood why certain features were
> made final or protected in the API - and I needed to work around them
> with either classloading or reflection tricks.
> Of course, now as a committer things look a bit different, as you
> don't want to have to justify all the handy helper methods and would
> rather make them non-public.
>
> Best wishes,
> Andi
>
> [1]
> http://blog.bdoughan.com/2010/09/jaxb-xml-infoset-preservation.html
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected] For additional
> commands, e-mail: [email protected]
>
