Re: [dev] Re: Grand Concept, splitting up the monolith, dynamic content

Mathias Bauer Fri, 26 Sep 2008 16:40:07 -0700

Yegor Jbanov wrote:

> Amen to that!
> 
> Plus:
> 
> 1. Documentation is the suckiest thing about OpenOffice.org SDKs and
> APIs. If OOo has been modularized, then nobody ever noticed.

Maybe because it's always easier to maintain prejudices than to
actually check them against reality from time to time. I don't say that
OOo is perfectly modularized, but it's also far away from being a
monolith. As I explained in my reply to Rene, IMHO people tend to think
that just because one particular aspect of modularity is not visible,
there can't be any kind of it. And that's not true.


> Is there
> a module dependency diagram anywhere? (This was a rhetorical question,
> of course.) We need a central, comprehensive, well-organized and
> up-to-date documentation web-site (take a look at how Google documents
> their toolkits). The official documentation is so badly
> organized/out-of-date/incorrect/incomplete that even Google fails to
> find relevant information. My major source has been so far the mail
> archive where people report their questions and sometimes get their
> answers, yet nobody ever cares to update the documentation so that
> others could find it easily.

I agree that our documentation needs improvement (you could volunteer to
help). But with some good will you can find a lot of interesting things
in the Developer's Guide. It explains, for example, how the application
framework of OOo works, and you can indeed see this as documentation of
the modular structure of OOo.

There are parts of OOo that lack modularization, but even where the
modularization is missing at the package or library level there may be a
clear architecture at the code level.

The new chart component that we added in OOo2.3 is a good example of
what is there and how it can be used. All three parts of this
"application" (model, view and controller code) are each in a separate
library. And there are no dependencies of the Framework on any of these
libraries; objects from these libraries are instantiated as UNO
services. You can remove Chart from the installation without breaking
anything - except sloppily written code that expects that "there always
is a Chart". But this is not a problem of bad modularization or
architecture, that's just a bug.
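
To illustrate the point about optional components, here is a minimal
sketch in Python. It is not the UNO API: ServiceRegistry, the service
name "example.Chart" and both insert_chart functions are made up for
this mail. It only shows the difference between code that asks the
registry for a service and copes with its absence, and code that assumes
the service is always there.

```python
from typing import Any, Callable, Dict, Optional


class ServiceRegistry:
    """Toy stand-in for a component registry (hypothetical, not UNO)."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], Any]] = {}

    def register(self, name: str, factory: Callable[[], Any]) -> None:
        self._factories[name] = factory

    def unregister(self, name: str) -> None:
        # Models removing an optional component from the installation.
        self._factories.pop(name, None)

    def create_instance(self, name: str) -> Optional[Any]:
        # Returns None when no component provides the requested service.
        factory = self._factories.get(name)
        return factory() if factory is not None else None


class ChartModel:
    """Placeholder for an object living in the optional chart library."""

    def describe(self) -> str:
        return "chart"


def insert_chart(registry: ServiceRegistry) -> str:
    # Robust caller: checks whether the optional service is available.
    chart = registry.create_instance("example.Chart")
    if chart is None:
        return "chart component not installed"
    return "inserted " + chart.describe()


def insert_chart_sloppy(registry: ServiceRegistry) -> str:
    # Sloppy caller: assumes "there always is a Chart" and fails
    # with AttributeError once the component has been removed.
    return "inserted " + registry.create_instance("example.Chart").describe()
```

The registry itself never imports ChartModel; only the registration
step knows about it. In this sketch that is what lets the component be
removed without touching the framework code.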

The same separation of application and framework BTW is true for Writer,
Calc etc. as well, thanks to the very modular and abstract design of the
framework. But admittedly the *internal* structure of these "modules"
lacks modularization. Currently only the dialogs of e.g. Writer are in a
separate library. My idea is to extend that to the whole UI code
sometime in the future. But this is quite some work to do and we can't
take ourselves out of the ongoing development for a year or so, so we
have to work on modularization along the way.

The biggest problems in this area are still our Drawing Layer, the
EditEngine/Outliner and the forms layer, which together totally undermine
any attempt to implement a model/view/controller separation in Writer (I
can't speak for Calc and Draw/Impress here). But I know that a very
motivated developer is trying to fix that even if it costs him several
years of his lifetime. ;-)

> 2. OOo file filters must become a standalone project that could be
> shared with KOffice, AbiWord and others. In general, having ability to
> use filters outside OOo is a major advantage.

Are you sure that you know how filters work? They don't convert from
format A to format B, they convert from a format to an API or "core
model". As long as the applications don't share the core model and the
API they never will be able to share the filter code.

Of course you can share parts of the filters, e.g. as in the case of
libwpd, which converts the imported format to a somewhat idealized model.
But you always will need some code around it that adapts this to the
concrete model of the application you want to import to.

Here's an example: our new docx import filter consists of three
components. One is the parser/tokenizer component that scans the file
and generates a kind of event stream that makes up an idealized and very
low-level model. Another component, the so-called "domain mapper",
converts this into API calls using the API of the document core model.
The API builds up a still idealized but already very concrete model. The
implementation of this API can be seen as the third part and it can
adapt from the still idealized API view to the very bits and bytes of
the C++ source code. As the three parts talk to each other through
defined and stable interfaces, basically each part can be exchanged for
another implementation. How much more modularization do you want to have?
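
The three-part structure described above can be sketched like this.
This is a toy in Python, not the real docx filter (which is C++ on top
of UNO): the input format ("# " marks a heading), the event tuples, the
style names and all class and function names are assumptions made up
for illustration.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical event shape: (kind, payload).
Event = Tuple[str, str]


def tokenize(source: str) -> Iterator[Event]:
    """Part 1: scan the input and emit low-level, idealized events."""
    for line in source.splitlines():
        if line.startswith("# "):
            yield ("heading", line[2:])
        elif line.strip():
            yield ("text", line.strip())


class DocumentApi:
    """Part 3: application-specific implementation behind the API."""

    def __init__(self) -> None:
        self.paragraphs: List[Tuple[str, str]] = []

    def append_paragraph(self, style: str, text: str) -> None:
        # A real implementation would touch the internal core model here.
        self.paragraphs.append((style, text))


class DomainMapper:
    """Part 2: translate tokenizer events into document API calls."""

    STYLE_FOR_EVENT: Dict[str, str] = {
        "heading": "Heading 1",
        "text": "Default",
    }

    def __init__(self, api: DocumentApi) -> None:
        self._api = api

    def handle(self, event: Event) -> None:
        kind, payload = event
        self._api.append_paragraph(self.STYLE_FOR_EVENT[kind], payload)


def import_document(source: str, api: DocumentApi) -> None:
    """Wire the three parts together through their interfaces."""
    mapper = DomainMapper(api)
    for event in tokenize(source):
        mapper.handle(event)
```

Only tokenize() is free of application knowledge in this sketch. The
mapper depends on the API, and the API implementation depends on the
application's internals.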

By far the most code is in the last part of the filter (the API
implementation) and of course this one can't be shared with other
applications as this would require that they use the same internal
implementation. But even the next big component, the Domain Mapper, is
not easily shareable, as this would require that the applications share
the component model (OOo uses UNO) and the API based on it. But the
inability to share these parts is not caused by missing modularization.
I hope this has become clear from my description.

So what you can share is the scanner/tokenizer, if you are willing to
plug it into the code of your application. This is only a small part of
the filter, but it's possible. You can try. :-)

I never investigated the code of the WordPerfect import filter, but
IIRC libwpd also can be seen as the scanner/tokenizer part of the
filter that can be shared between applications.

> There are so many
> use-cases for filters other than opening Word files for editing in
> OOo. Content management systems (Alfresco), reporting software
> (JFreeReports), document intelligence (redaction), web-office suites
> (Zoho, GDocs), etc, all need multi-format support.

You are not talking about filters but about converters. A converter is a
shortcut between an import and an export filter. It's not necessary to
share converters with other applications on code or module level; they
are standalone applications as they communicate with other apps through
files, not through code.

I assume that you want to have such a standalone application based on
the OOo filters. I agree that this would be fine.

But even these converters will need to contain the core model of the
application their filters are based on. And this will pull in a major
part of OOo, regardless of whether it's modularized or not.
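
A minimal sketch of what "a converter is a shortcut between an import
and an export filter" means, again in Python with made-up names
(CoreModel, import_plain_text, export_html, convert). The toy formats
stand in for real ones and the export does no HTML escaping. The point
is only that both filters meet at the core model, so a converter cannot
exist without it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CoreModel:
    """Hypothetical stand-in for the application's document core model."""

    paragraphs: List[str] = field(default_factory=list)


def import_plain_text(data: str) -> CoreModel:
    """Import filter: file format -> core model (blank line = new paragraph)."""
    return CoreModel([p.strip() for p in data.split("\n\n") if p.strip()])


def export_html(model: CoreModel) -> str:
    """Export filter: core model -> file format (toy output, no escaping)."""
    return "".join(f"<p>{p}</p>" for p in model.paragraphs)


def convert(data: str) -> str:
    """Converter: an import filter chained to an export filter."""
    return export_html(import_plain_text(data))
```

For example, convert("one\n\ntwo\n") yields "<p>one</p><p>two</p>". The
caller sees only data in and data out, but CoreModel is still
constructed in between.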

Admittedly, this will currently pull in even some unnecessary code,
e.g. the UI code of Writer, which surely isn't needed in a converter. And
this is one reason why I would like to separate all UI code from the
core model code so that we could create a converter that does not
contain it. But I don't see any reason to go further with modularization
and e.g. split up the core into modules, as this would mean effort with a
small advantage but several disadvantages. One is a possible performance
penalty caused by additional interface layers and another one is a more
basic consideration.

IMHO all code that is needed to work with the application's feature set
is mandatory code and it must be part of even the smallest possible
converter or any other application you want to build on OOo's
capabilities (that are themselves based on ODF).

Whether this mandatory code is modularized internally or not is
completely irrelevant for the converter - it may be advantageous to make
larger, rarely used features loadable on demand (e.g. to speed up the
startup of OOo), but at least they must be part of the whole set. I
never would like to see any code associated with OOo that would not be
able to deal with ODF in its completeness.

Ciao,
Mathias

-- 
Mathias Bauer (mba) - Project Lead OpenOffice.org Writer
OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
Please don't reply to "[EMAIL PROTECTED]".
I use it for the OOo lists and only rarely read other mails sent to it.
