Re: [Libreoffice] [GSoc] Progress report - Visio import filter

2011-05-26 Thread Cedric Bosdonnat
Hi Fridrich, Eilidh,

On Thu, 2011-05-26 at 09:39 +0200, Fridrich Strba wrote:
> In our private conversation you asked for some guidance about how to
> structure the library. Here are my basic thoughts (that are again my
> thoughts that come from having contributed to several libraries, but
> they are not the God's word):

Good that you made that conversation public, Fridrich :)

> 2) Since in the isSupported function I see that you are distinguishing
> two versions of Visio Document, I would suggest that you write a base
> parser class something like:
> 
> class VSDXParser
> {
> public:
> VSDXParser(WPXInputStream *input);
> ~VSDXParser();
> protected:
> 
> private:
> 
> };
> 
> That would contain common functions for all the formats as well as the
> common state that you will need to keep. It could have two derived
> classes for the n=11 and n=6 

Indeed, that's a more object-oriented way of coding. That's what we already
did in the DOC / DOCX / RTF export in sw; it saves a lot of work and keeps
the code much cleaner.

> Now in the VisioDocument::parse(...) function, one could detect which
> file-format we are parsing, construct the corresponding VSDParser and
> call the parse on it.

Yes!

> 3) As to the development process, I would suggest to first have some dry
> parsing in place, with functions that read the different elements of the
> Visio document without processing them really. You can plant several
> VSD_DEBUG_MSG((...)); statements inside the functions (include the
> libvisio_utils.h and optionally un-comment for the time of heavy
> development the #define VERBOSE_DEBUG 1). Doing so, you get maximum of
> information on your console without actually the parser calling any of
> the interface callbacks. Then you can start from there by actually
> processing the useful content.

Don't hesitate to output loads of TODO printfs to help you monitor what
is missing. This way you'll easily see your progress. Ask Miklos about
it, I think he appreciated the idea last year ;)

> Sometimes, GSoC students are scared that pushing publicly code of
> questionable quality would be detrimental for them when a prospective
> employer googles for their work. This is largely a myth and the evidence
> is that if that was true, I would probably have to have spent all my
> life living on social help :)

To follow up on Fridrich's point: very little code is perfect when it is
first written... so push it as long as it works (doesn't break important
things) and fix it afterwards. Nobody will complain about that, as almost
every developer does it (or did it at some point).

Regards,

-- 
Cédric Bosdonnat
LibreOffice hacker
http://documentfoundation.org
OOo Eclipse Integration developer
http://cedric.bosdonnat.free.fr

___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: [Libreoffice] [GSoc] Progress report - Visio import filter

2011-05-26 Thread Fridrich Strba
Hello, Eilidh,

In our private conversation you asked for some guidance about how to
structure the library. Here are my basic thoughts (again, these are just
thoughts that come from having contributed to several libraries;
they are not God's word):

1) Since you will quite often have to parse compressed chunks of a stream,
it might be useful to write a class like the following one:

class VSDInternalStream : public WPXInputStream
{
public:
  VSDInternalStream(WPXInputStream *input, size_t dataSize,
                    bool isCompressed);
  ~VSDInternalStream();
  (...)
private:
  // holds the (decompressed) chunk data
  std::vector<unsigned char> m_buffer;
  // declared private so that default construction and copying are disabled
  VSDInternalStream();
  VSDInternalStream(const VSDInternalStream&);
};

That would be constructed by reading dataSize bytes of the input into
m_buffer, decompressing them on the fly if the chunk is compressed. That
way, this quite frequent task would live in one place. The advantage would
be that the resulting stream would be seekable and you could read it just
like any other WPXInputStream.
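
A minimal sketch of that buffering idea, assuming a stand-alone class rather than a real WPXInputStream subclass (the libwpd interface is not reproduced here). The names BufferedStream, Whence and Decompressor are hypothetical, and the actual Visio decompression is left as a function the caller plugs in.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a seekable input stream; this is NOT the
// libwpd WPXInputStream API, only an illustration of the buffering idea.
class BufferedStream
{
public:
  typedef std::vector<unsigned char> Bytes;
  typedef std::function<Bytes(const Bytes &)> Decompressor;
  enum class Whence { Set, Cur };

  // Copy dataSize bytes out of source (starting at offset) and decompress
  // them once, up front, if the chunk is flagged as compressed.
  BufferedStream(const Bytes &source, std::size_t offset, std::size_t dataSize,
                 bool isCompressed, const Decompressor &decompress)
    : m_buffer(), m_pos(0)
  {
    // Clamp the requested window to what the source really contains.
    if (offset > source.size())
      offset = source.size();
    if (dataSize > source.size() - offset)
      dataSize = source.size() - offset;
    Bytes raw(source.begin() + offset, source.begin() + offset + dataSize);
    m_buffer = (isCompressed && decompress) ? decompress(raw) : raw;
  }

  // Read up to numBytes; returns the number of bytes actually copied.
  std::size_t read(unsigned char *out, std::size_t numBytes)
  {
    std::size_t available = m_buffer.size() - m_pos;
    std::size_t n = numBytes < available ? numBytes : available;
    for (std::size_t i = 0; i < n; ++i)
      out[i] = m_buffer[m_pos + i];
    m_pos += n;
    return n;
  }

  // Seeking is cheap because everything already lives in memory.
  // The resulting position is clamped to [0, size].
  void seek(long offset, Whence whence)
  {
    long base = (whence == Whence::Cur) ? static_cast<long>(m_pos) : 0;
    long target = base + offset;
    if (target < 0)
      target = 0;
    if (static_cast<std::size_t>(target) > m_buffer.size())
      target = static_cast<long>(m_buffer.size());
    m_pos = static_cast<std::size_t>(target);
  }

  std::size_t tell() const { return m_pos; }
  bool atEnd() const { return m_pos >= m_buffer.size(); }

private:
  Bytes m_buffer;
  std::size_t m_pos;
};
```

Decompressing eagerly in the constructor costs memory for large chunks, but it keeps read and seek trivial, which seems to be the point of the suggestion.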

2) Since in the isSupported function I see that you are distinguishing
two versions of Visio Document, I would suggest that you write a base
parser class something like:

class VSDXParser
{
public:
  VSDXParser(WPXInputStream *input);
  ~VSDXParser();
protected:

private:

};

That would contain common functions for all the formats as well as the
common state that you will need to keep. It could have two derived
classes, one for n=11 and one for n=6:

class VSDParser : protected VSDXParser
{
public:
  VSDParser(WPXInputStream *input);
  ~VSDParser();
  bool parse(libwpg::WPGPaintInterface *iface);
private:

};

Those would contain functions specific to the given file-format
version, as well as version-specific state information that cannot be
factored out into the VSDXParser.

Now in the VisioDocument::parse(...) function, one could detect which
file-format we are parsing, construct the corresponding VSDParser and
call the parse on it.
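
A rough sketch of that dispatch, under stated assumptions: ParserBase, Version6Parser, Version11Parser, makeParser and kVersionOffset are all hypothetical names, and the version byte position is a placeholder, not the real Visio header layout. Unlike the class outline above, this sketch uses public inheritance so that the caller can hold either parser through a base-class pointer.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Base class: state and helpers shared by every format version.
class ParserBase
{
public:
  explicit ParserBase(const Bytes &input) : m_input(input) {}
  virtual ~ParserBase() {}
  // Each version supplies its own parse(); returns true on success.
  virtual bool parse(std::vector<std::string> &log) = 0;
protected:
  const Bytes &m_input;
};

class Version6Parser : public ParserBase
{
public:
  explicit Version6Parser(const Bytes &input) : ParserBase(input) {}
  bool parse(std::vector<std::string> &log) override
  {
    log.push_back("parsed as version 6");
    return true;
  }
};

class Version11Parser : public ParserBase
{
public:
  explicit Version11Parser(const Bytes &input) : ParserBase(input) {}
  bool parse(std::vector<std::string> &log) override
  {
    log.push_back("parsed as version 11");
    return true;
  }
};

// ASSUMPTION: the version is one byte at a fixed offset. The value 0 is a
// placeholder for illustration only.
const std::size_t kVersionOffset = 0;

// Detect the version and build the matching parser (null if unsupported).
std::unique_ptr<ParserBase> makeParser(const Bytes &input)
{
  if (input.size() <= kVersionOffset)
    return std::unique_ptr<ParserBase>();
  switch (input[kVersionOffset])
  {
  case 6:
    return std::unique_ptr<ParserBase>(new Version6Parser(input));
  case 11:
    return std::unique_ptr<ParserBase>(new Version11Parser(input));
  default:
    return std::unique_ptr<ParserBase>();
  }
}

// Plays the role of the document-level parse(): detect, construct, delegate.
bool parseDocument(const Bytes &input, std::vector<std::string> &log)
{
  std::unique_ptr<ParserBase> parser = makeParser(input);
  return parser ? parser->parse(log) : false;
}
```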

3) As to the development process, I would suggest first getting some dry
parsing in place, with functions that read the different elements of the
Visio document without really processing them. You can plant several
VSD_DEBUG_MSG((...)); statements inside the functions (include
libvisio_utils.h and, optionally, for the time of heavy development,
un-comment the #define VERBOSE_DEBUG 1). Doing so, you get the maximum of
information on your console without the parser actually calling any of
the interface callbacks. Then you can start from there by actually
processing the useful content.
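
To illustrate the double-parenthesis debug macro and a "dry" reader, here is a small sketch. SKETCH_DEBUG_MSG and SKETCH_VERBOSE_DEBUG are hypothetical names (the real macros live in libvisio_utils.h and may be defined differently), and the [type][length][payload] record layout is invented for the example.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical macro names. Define SKETCH_VERBOSE_DEBUG before this point
// to get console output; otherwise the messages compile away to nothing.
#ifdef SKETCH_VERBOSE_DEBUG
#define SKETCH_DEBUG_MSG(M) std::printf M
#else
#define SKETCH_DEBUG_MSG(M)
#endif

// "Dry" reader: walks made-up [type][length][payload] records, reports what
// it sees, and calls no interface callbacks. Returns the number of complete
// records found.
std::size_t dryReadRecords(const unsigned char *data, std::size_t length)
{
  std::size_t pos = 0;
  std::size_t count = 0;
  while (pos + 2 <= length)
  {
    unsigned type = data[pos];
    std::size_t payload = data[pos + 1];
    (void)type; // keeps the build warning-free when debug output is off
    if (payload > length - (pos + 2))
    {
      SKETCH_DEBUG_MSG(("TODO: truncated record of type 0x%02x\n", type));
      break;
    }
    SKETCH_DEBUG_MSG(("record type 0x%02x, %lu payload bytes (skipped)\n",
                      type, static_cast<unsigned long>(payload)));
    pos += 2 + payload;
    ++count;
  }
  return count;
}
```

The extra pair of parentheses lets a whole printf argument list travel through a one-parameter macro, so the call sites stay the same whether debugging is on or off.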

Myself, I would maybe write a VSDElement class that would construct
itself from a pointer to the current input stream and would have
some kind of processContent function that decides whether to call a
private _readContent(...) for supported elements or _skipContent(...)
for unsupported ones. But again, this is too much implementation
detail, and I freely confess that I am biased by what we did in
libwpd and libwpg.
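
One possible shape for such an element class, as a sketch only. The class name Element, the type codes, and the [type][length][payload] layout are hypothetical, and the input is a plain byte vector instead of a WPXInputStream.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical read-or-skip element; layout and type codes are invented.
class Element
{
public:
  // pos is shared with the caller so that it advances across elements.
  Element(const std::vector<unsigned char> &input, std::size_t &pos)
    : m_input(input), m_pos(pos) {}

  // Decide per element whether to read or skip its content.
  // Returns false when no complete element is left in the input.
  bool processContent(const std::set<unsigned> &supported,
                      std::vector<unsigned char> &out)
  {
    if (m_pos > m_input.size() || m_input.size() - m_pos < 2)
      return false;
    unsigned type = m_input[m_pos];
    std::size_t length = m_input[m_pos + 1];
    if (length > m_input.size() - m_pos - 2)
      return false;
    m_pos += 2;
    if (supported.count(type))
      readContent(length, out);
    else
      skipContent(length);
    return true;
  }

private:
  // Supported element: append its payload to the output.
  void readContent(std::size_t length, std::vector<unsigned char> &out)
  {
    out.insert(out.end(), m_input.begin() + m_pos,
               m_input.begin() + m_pos + length);
    m_pos += length;
  }

  // Unsupported element: just step over its payload.
  void skipContent(std::size_t length) { m_pos += length; }

  const std::vector<unsigned char> &m_input;
  std::size_t &m_pos;
};
```

Because unknown elements are skipped by their declared length, the parser keeps its place in the stream even while most element types are still unsupported.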

4) The bottom line of a good FOSS development model is to push small
changes often. It has two big advantages:
a) it is easier to bisect changes when something breaks;
b) it gives a nice overview of progress.
If atomic changes are committed and pushed (or at least the day's work
at the end of the day), I will be able to look at it often and pat your
back if things are wonderful, marvelous, beyond the wildest
imagination; or ask questions, seek clarification and discuss
directions if needed. Communication is the main challenge of any GSoC
endeavour, and a git repository can help us get it right.

Sometimes, GSoC students are scared that publicly pushing code of
questionable quality would be detrimental for them when a prospective
employer googles for their work. This is largely a myth, and the evidence
is that if it were true, I would probably have had to spend all my
life living on social help :)

Happy hacking

Fridrich

On Sun, 2011-05-08 at 17:08 +0100, Tibby Lickle wrote:
> Hi,
> 
> Just an update on where I am. So far I've been working on the basics
> of extracting the data from the .vsd file.
> To read Visio files, the steps are roughly:
> 1. Get the interesting part ("VisioDocument") from the OLE container.
> 2. Parse the header to get a pointer to the trailer stream (as well as
> version, length of file, etc.)
> 3. Inflate compressed trailer.
> 4. Parse out pointers in trailer to the various - potentially
> compressed - streams that hold the actual Visio document content.
> 
> I've done 1 - 3. I'm using the WPXStream and its implementation from
> libwpd (WPXStreamImplementation.h) to read/extract OLE streams.
> The implementation of LZW-esque decompression of the trailer is
> translated from Python to C++ (i.e. shamelessly ripped off) from
> oletoy (thanks frob). 
> I suspect most of what I'll be doing will be stand-alone for now -
> developing and debugging will be too slow if LO integration is
> included at this early stage. Once I've got a very basic parser, the
> cal

Re: [Libreoffice] [GSoc] Progress report - Visio import filter

2011-05-17 Thread Fridrich Strba
Eilidh,

I created a git repository
http://cgit.freedesktop.org/libreoffice/contrib/libvisio/

It has a skeleton of a library whose API is basically this class:

namespace libvisio
{
class VisioDocument
{
public:
  static bool isSupported(WPXInputStream* input);
  static bool parse(WPXInputStream* input,
                    libwpg::WPGPaintInterface* painter);
  static bool generateSVG(WPXInputStream* input,
                          WPXString& output);
};
}

The isSupported function should take the input and judge whether it is a
Visio document that the library can handle. If so, it should return
true, otherwise false. This function currently does nothing and always
returns false.

The parse function takes the input and parses the document. As output, it
calls the functions of painter. If the parse was successful, it should
return true, otherwise false. This function, too, currently does nothing
and always returns false.

The generateSVG function is a convenience function that is actually already
implemented, using the parse function itself and the VSDSVGGenerator class
that is now part of the library. So there is no need to touch that one, and
the vsd2svg tool that lives in src/conv/svg will use it to generate SVG on
stdout.

Another tool, vsd2raw, only outputs the names of the callbacks and
their parameters. This is useful for debugging, because it is the
simplest possible implementation of the libwpg::WPGPaintInterface.
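
To show what a "raw" painter of this kind looks like, here is a sketch built on a hypothetical, much-simplified callback interface. PaintInterface, RawPainter and parseDummy are invented names; the real libwpg::WPGPaintInterface has more callbacks and different signatures.

```cpp
#include <ostream>
#include <sstream>

// Hypothetical, much-simplified callback interface (not the libwpg one).
class PaintInterface
{
public:
  virtual ~PaintInterface() {}
  virtual void startGraphics(double width, double height) = 0;
  virtual void drawRectangle(double x, double y, double w, double h) = 0;
  virtual void endGraphics() = 0;
};

// "Raw" painter: draws nothing, only prints each callback and its arguments.
class RawPainter : public PaintInterface
{
public:
  explicit RawPainter(std::ostream &out) : m_out(out) {}

  void startGraphics(double width, double height) override
  {
    m_out << "startGraphics(" << width << ", " << height << ")\n";
  }

  void drawRectangle(double x, double y, double w, double h) override
  {
    m_out << "drawRectangle(" << x << ", " << y << ", "
          << w << ", " << h << ")\n";
  }

  void endGraphics() override
  {
    m_out << "endGraphics()\n";
  }

private:
  std::ostream &m_out;
};

// Stand-in for a parser: drives the painter for one fixed page and shape.
bool parseDummy(PaintInterface &painter)
{
  painter.startGraphics(8.5, 11);
  painter.drawRectangle(1, 2, 3, 4);
  painter.endGraphics();
  return true;
}
```

Passing std::cout gives a console trace; passing a std::ostringstream makes the trace easy to compare against an expected sequence of callbacks.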

I have also "implemented", inside LibreOffice master, a Visio import filter
that is able to use an external libvisio.

So it is enough to do the following:

1) Build your libvisio and install it (you might also need to build
system libwpd and libwpg if you are not running something sane like
openSUSE 11.4).
2) Build LibreOffice master (if you manage) using --with-system-libwpd
--with-system-libwpg --with-system-libvisio.
3) After the build has finished, make dev-install the LibreOffice you just
built and copy libvisioimport*.so into the basis3.4/program
directory, next to libraries like libmsworks*.so or libwpgimport*.so.
If those libraries are links, you can link libvisioimport*.so in the
same way, but this should not be something you will really need to touch
at this stage.
4) You can then develop libvisio normally and keep installing it to the
same prefix, and it should be picked up immediately by your build of
LibreOffice.
5) As a quick hack, you can view the documents that you partially convert
using vsd2svg and any SVG viewer, such as rsvg-view or even Firefox.

I hope this gets you started pretty fast. Please stay on IRC as
much as you can :)

F.





Re: [Libreoffice] [GSoc] Progress report - Visio import filter

2011-05-16 Thread Fridrich Strba
Eilidh

On Sun, 2011-05-08 at 17:08 +0100, Tibby Lickle wrote:
> 1. Get the interesting part ("VisioDocument") from the OLE container.
> 2. Parse the header to get a pointer to the trailer stream (as well as
> version, length of file, etc.)
> 3. Inflate compressed trailer.
> 4. Parse out pointers in trailer to the various - potentially
> compressed - streams that hold the actual Visio document content.
> I've done 1 - 3. I'm using the WPXStream and its implementation from
> libwpd (WPXStreamImplementation.h) to read/extract OLE streams.
> The implementation of LZW-esque decompression of the trailer is
> translated from Python to C++ (i.e. shamelessly ripped off) from
> oletoy (thanks frob).

Way too cool!


> I suspect most of what I'll be doing will be stand-alone for now -
> developing and debugging will be too slow if LO integration is
> included at this early stage. Once I've got a very basic parser, the
> callback interface discussed in my proposal will be implemented and
> integration with LO should in theory be relatively easy.

Now, what I would really love is if we could put some anchor in some git
space so that I could create for you a quick skeleton of a library and
the build system so that you would only do the fleshing out. Just create
an empty module "libvisio" somewhere (I checked and libvsd exists
already for some virtual server daemon). I will create a skeleton there
and we can work easily.

A week or two ago, I refactored the writerperfect module in libreoffice
tree so that it might be +/- trivial to add the support.
> 
> Note to my mentor -- I've got a paper due for next Saturday so my main
> focus will be on that. I will, however, be spending some time on the
> next stage.
> 
Hope everything went well.

F.



Re: [Libreoffice] [GSOC] Progress Report

2011-05-10 Thread Xisco Faulí
Hello there,

I finally got yava2python to run. I only needed to download txl from
http://www.txl.ca/
I've already tried to convert some files. It's not straightforward and, in
fact, it needs quite a bit of work, but I think it makes things much
easier. Thanks Michael.
Btw, I've just realized I only sent the last email to Andras, shit.

2011/5/9 Xisco Faulí 

> I think it could be a good idea to use a Java to Python converter too, but
> unfortunately I couldn't run any converter against the code.
> First I tried java2python, but it doesn't work with the Java 5 grammar. I
> suppose java2python with j2p is the way to make it work, but I couldn't
> make them run. Then I found another converter called yava2python, but
> every time I run it, it says: txl: not found.
> Today I didn't have much time, so I'll try again tomorrow. Anyway, if
> anybody can give me a hand with it, I'd appreciate it.
> Btw, I couldn't reach Björn today, he's at a conference, so the decision on
> using a Java to Python converter is completely up to him.
> The same goes for generating fax templates at runtime. From my point of
> view, I'd try to get the wizards working in Python first and then make it
> possible to create the templates at runtime but, as I said, this isn't up
> to me.
>
> Greetings
>
> 2011/5/9 Andras Timar 
>
>> Hi,
>>
>> 2011/5/8 Xisco Faulí :
>> > My task during the GSOC period is going to be the conversion of the
>> > Wizard menus into python.
>> > I met Björn Michäelsen last monday and we decided, on broad lines, the
>> > course to follow. First I'll create a basic fax design using pyuno, then
>> > I'll make it customized and finally I'll create the GUI. Once the fax
>> > wizard is done, I'll move on with the other wizards.
>>
>> I wonder if it would be possible to generate fax templates runtime
>> from the locale data (page size, date format) and from localized
>> strings. This way we could get rid of the numerous template files in
>> our packages, and we could have them localized easier. I know it's out
>> of scope of the original project. :) I just wanted to let you know
>> about this. I can help you in all localization related problems.
>>
>> Best regards,
>> Andras
>>
>
>


Re: [Libreoffice] [GSOC] Progress Report

2011-05-09 Thread Andras Timar
Hi,

2011/5/8 Xisco Faulí :
> My task during the GSOC period is going to be the conversion of the Wizard
> menus into python.
> I met Björn Michäelsen last monday and we decided, on broad lines, the
> course to follow. First I'll create a basic fax design using pyuno, then
> I'll make it customized and finally I'll create the GUI. Once the fax wizard
> is done, I'll move on with the other wizards.

I wonder if it would be possible to generate fax templates at runtime
from the locale data (page size, date format) and from localized
strings. This way we could get rid of the numerous template files in
our packages, and we could localize them more easily. I know it's out
of scope of the original project. :) I just wanted to let you know
about this. I can help you with all localization-related problems.

Best regards,
Andras


Re: [Libreoffice] [GSOC] Progress Report

2011-05-09 Thread Michael Meeks
Hi Xisco,

On Sun, 2011-05-08 at 23:02 +0200, Xisco Faulí wrote:
> My task during the GSOC period is going to be the conversion of the
> Wizard menus into python.

Nice :-)

> I met Björn Michäelsen last monday and we decided, on broad lines, the
> course to follow. First I'll create a basic fax design using pyuno,
> then I'll make it customized and finally I'll create the GUI. Once the
> fax wizard is done, I'll move on with the other wizards.

Ooh :-) so - I had always assumed that there is probably a -lot- of
mileage in using an automatic Java -> Python converter. This is because
our Java code uses (primarily) OO.o / UNO APIs that should map to python
quite nicely (perhaps with some converter tweaks). Our UNO / Abstract
Windowing Toolkit ('awt') APIs are used instead of the native Java AWT,
e.g.

> This week I've been studying the java code in order to see what things
> do and playing around a bit with Pyuno and the GUI. I've tried to
> create some text components and some GUI dialogs. So far it's going
> good although I've had problem with a few things I have to take a
> deeper look.

Great :-) My problem was that ooinstall doesn't seem to install the
python scripting extension:

solver/300/unxlngi6.pro/bin/script-provider-for-python.oxt

which is deadly annoying ;-) but anyhow; here are some links:

http://code.google.com/p/j2p/
http://code.google.com/p/java2python/
http://debedb.blogspot.com/2007/03/java-to-python-converter.html

Personally, I would prefer an automated conversion to a manual one: not
only because it should go quicker, but also because it should be more
reliable (in theory) - and hopefully more re-usable for the other Java
bits we have.

But of course Bjoern is your mentor :-) you need to please him.

ATB,

Michael.

-- 
 michael.me...@novell.com  <><, Pseudo Engineer, itinerant idiot




Re: [Libreoffice] [GSOC] Progress Report

2011-05-09 Thread Cedric Bosdonnat
Hi Xisco,

On Sun, 2011-05-08 at 23:02 +0200, Xisco Faulí wrote:
> This week I've been studying the java code in order to see what things
> do and playing around a bit with Pyuno and the GUI. I've tried to
> create some text components and some GUI dialogs. So far it's going
> good although I've had problem with a few things I have to take a
> deeper look.

When you come to the letter creation wizard, could you please take
into account that the elements in a letter can vary depending on the
locale? I don't know whether keeping this wizard is meaningful at all...
but the existing one isn't correct for use by French people at least.

Regards,

-- 
Cédric Bosdonnat
LibreOffice hacker
http://documentfoundation.org
OOo Eclipse Integration developer
http://cedric.bosdonnat.free.fr



[Libreoffice] [GSOC] Progress Report

2011-05-08 Thread Xisco Faulí
Hello everybody,

My task during the GSOC period is going to be the conversion of the Wizard
menus into python.
I met Björn Michäelsen last Monday and we decided, in broad lines, the
course to follow. First I'll create a basic fax design using pyuno, then
I'll make it customizable and finally I'll create the GUI. Once the fax
wizard is done, I'll move on to the other wizards.
This week I've been studying the Java code in order to see what things do
and playing around a bit with Pyuno and the GUI. I've tried to create some
text components and some GUI dialogs. So far it's going well, although I've
had problems with a few things that I need to take a deeper look at.

Greetings


[Libreoffice] [GSoc] Progress report - Visio import filter

2011-05-08 Thread Tibby Lickle
Hi,

Just an update on where I am. So far I've been working on the basics of
extracting the data from the .vsd file.
To read Visio files, the steps are roughly:
1. Get the interesting part ("VisioDocument") from the OLE container.
2. Parse the header to get a pointer to the trailer stream (as well as
version, length of file, etc.)
3. Inflate compressed trailer.
4. Parse out pointers in trailer to the various - potentially compressed -
streams that hold the actual Visio document content.

I've done 1 - 3. I'm using the WPXStream and its implementation from libwpd
(WPXStreamImplementation.h) to read/extract OLE streams. The implementation
of LZW-esque decompression of the trailer is translated from Python to C++
(i.e. shamelessly ripped off) from oletoy (thanks frob).
I suspect most of what I'll be doing will be stand-alone for now -
developing and debugging will be too slow if LO integration is included at
this early stage. Once I've got a very basic parser, the callback interface
discussed in my proposal will be implemented and integration with LO should
in theory be relatively easy.
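
Step 2 in the list above (parsing the header to find the trailer) might look roughly like the sketch below. It assumes little-endian fields, and the position of the pointer (kTrailerPointerPos) and its two-field layout are placeholders for illustration, not the real Visio header layout. StreamPointer, readU32 and readTrailerPointer are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian 32-bit value at pos; returns false if out of range.
bool readU32(const std::vector<unsigned char> &data, std::size_t pos,
             std::uint32_t &value)
{
  if (pos > data.size() || data.size() - pos < 4)
    return false;
  value = static_cast<std::uint32_t>(data[pos])
          | (static_cast<std::uint32_t>(data[pos + 1]) << 8)
          | (static_cast<std::uint32_t>(data[pos + 2]) << 16)
          | (static_cast<std::uint32_t>(data[pos + 3]) << 24);
  return true;
}

// Hypothetical pointer record: where a stream starts and how long it is.
struct StreamPointer
{
  std::uint32_t offset;
  std::uint32_t length;
};

// PLACEHOLDER: byte position of the trailer pointer inside the header.
const std::size_t kTrailerPointerPos = 4;

// Extract the trailer pointer and sanity-check it against the file size.
bool readTrailerPointer(const std::vector<unsigned char> &file,
                        StreamPointer &ptr)
{
  std::uint32_t offset = 0;
  std::uint32_t length = 0;
  if (!readU32(file, kTrailerPointerPos, offset) ||
      !readU32(file, kTrailerPointerPos + 4, length))
    return false;
  // Reject pointers that reach past the end of the data.
  if (offset > file.size() || length > file.size() - offset)
    return false;
  ptr.offset = offset;
  ptr.length = length;
  return true;
}
```

Validating the pointer before following it matters here, since every later step (inflating the trailer, walking its pointers) trusts these two numbers.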

Note to my mentor -- I've got a paper due for next Saturday so my main focus
will be on that. I will, however, be spending some time on the next stage.

Eilidh