Re: [dev] scripted multiplatform .doc to .html conversion

2006-05-02 Thread Andreas Höhmann
Kirk Israel wrote:
 
 And in terms of expanding on that so that it takes .doc as input (right now
 it seems to only accept .odt) and HTML as output (currently not one of the
 options listed in the program), are there any gotchas I should know about or
 is it just about finding some appropriate API documentation and doing the
 fairly obvious things?
 

Have a look at JOOConverter (http://jooreports.sourceforge.net/).

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [dev] scripted multiplatform .doc to .html conversion

2006-04-20 Thread Tom Schindl
Hi Kirk,

No, you simply have to discover the right filters to use ;-)

A more appropriate place to ask is:
- dev@api.openoffice.org
- http://api.openoffice.org/DevelopersGuide/DevelopersGuide.html
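
To make "the right filters" a bit more concrete, here is a minimal sketch that picks an export filter name from the output file's extension. The class and method names are made up for illustration. The filter names are the ones commonly quoted for Writer documents in OOo 2.x, but treat them as assumptions and check them against the filter list of your own installation. The chosen name would then be passed as the "FilterName" property to XStorable.storeToURL. On the input side, loading a .doc normally relies on OOo's own type detection, so no filter name should be needed there.

```java
import java.util.Locale;
import java.util.Map;

/** Illustrative helper, not part of the OOo API: maps an output extension to an export filter name. */
final class ExportFilters {
    // Assumed filter names for Writer documents; verify against your OOo version.
    private static final Map<String, String> BY_EXTENSION = Map.of(
            "html", "HTML (StarWriter)",
            "doc", "MS Word 97",
            "odt", "writer8",
            "pdf", "writer_pdf_Export");

    private ExportFilters() {}

    /** Returns the export filter name to use for the given output file name. */
    static String forOutputFile(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("No file extension: " + fileName);
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        String filter = BY_EXTENSION.get(ext);
        if (filter == null) {
            throw new IllegalArgumentException("No export filter known for ." + ext);
        }
        return filter;
    }
}
```

For example, ExportFilters.forOutputFile("report.html") yields the name to put into the "FilterName" property when storing the loaded document.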

But once more: if you plan to use this snippet in a multi-threaded
environment like your J2EE server, you need to serialize access in your
application, or keep a pool of OOo instances and dispatch each
conversion to one of them.
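
A minimal sketch of the pool variant, using only the JDK. InstancePool is a hypothetical name, and the type parameter T stands for whatever handle your application keeps per running OOo instance (a connection object, a port number, ...). The point is only that a handle is taken out of the queue for the duration of one conversion, so no instance ever serves two requests at once.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

/** Hypothetical pool: each instance handle is used by at most one caller at a time. */
final class InstancePool<T> {
    private final BlockingQueue<T> idle;

    InstancePool(List<T> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("pool needs at least one instance");
        }
        // Fair queue, pre-filled with all instance handles.
        idle = new ArrayBlockingQueue<>(instances.size(), true, instances);
    }

    /** Blocks until an instance is free, runs the task on it, then returns it to the pool. */
    <R> R withInstance(Function<T, R> task) {
        T instance;
        try {
            instance = idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a free instance", e);
        }
        try {
            return task.apply(instance);
        } finally {
            idle.add(instance); // capacity was freed by take(), so add() cannot fail
        }
    }
}
```

A servlet would call something like pool.withInstance(office -> convert(office, file)), where convert is your own method that wraps the snippet.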

One more thing: I don't think you need to have the OOo/program directory
in your classpath. The OpenOffice 2 libs should locate soffice.bin
themselves, but I could be mistaken here.

Tom

Kirk Israel wrote:
 Tom,
 thanks, that is very cool.
 I was able to get the snippet up and running...
 Through trial and error I got the correct Jars from my OOo directory I
 needed to compile against,
 and then Google indicated I needed to include the OOo/program directory in
 the classpath.
 
 Was there a smarter way I should have known the above?
 
 And in terms of expanding on that so that it takes .doc as input (right now
 it seems to only accept .odt) and HTML as output (currently not one of the
 options listed in the program), are there any gotchas I should know about or
 is it just about finding some appropriate API documentation and doing the
 fairly obvious things?
 
 This was a great first step, many thanks!
 -Kirk
 
 
 
 





 



Re: [dev] scripted multiplatform .doc to .html conversion

2006-04-19 Thread Tom Schindl
Hi,

There's a fully functional code snippet available that shows how
document conversion can be done.

http://codesnippets.services.openoffice.org/Office/Office.ConvertDocuments.snip

If you are running this from a J2EE application, you need to take into
consideration that ***one*** OO instance cannot deal with multiple
requests at the same time, so you must either:
- serialize access to OO, or
- create a pool of instances you connect to and serialize access to
  each of them
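
A minimal sketch of the first option (a single instance with serialized access), again using only the JDK. SerializedOffice is a made-up name, and the Supplier stands for whatever code actually talks to OOo. All it guarantees is that two conversions never run at the same time against the one instance.

```java
import java.util.function.Supplier;

/** Hypothetical wrapper around a single OOo instance: conversions run strictly one at a time. */
final class SerializedOffice {
    private final Object lock = new Object();

    /** Runs the given conversion while holding the lock, so callers queue up behind each other. */
    <R> R run(Supplier<R> conversion) {
        synchronized (lock) {
            return conversion.get();
        }
    }
}
```

This is the simplest thing that works, at the price of throughput: every request waits for the one before it, which is why a pool may be preferable under load.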

Tom

Kirk Israel wrote:
 This project got backburnered but is now coming up again: the concept of
 integrating OOo's doc to HTML conversion as seamlessly as possible into an
 existing J2EE application.
 
 My understanding is that OOo must be present (copied to the machine, but
 not necessarily installed), based on Mathias' previous comments. At that
 point, it should be fairly easy to go through with the UNO libraries...is
 that about the size of it? Am I missing anything, or are there any
 resources that might make this easier?
 
 Thanks,
 Kirk
 




Re: [dev] scripted multiplatform .doc to .html conversion

2006-04-19 Thread Kirk Israel
Tom,
thanks, that is very cool.
I was able to get the snippet up and running...
Through trial and error I got the correct Jars from my OOo directory I
needed to compile against,
and then Google indicated I needed to include the OOo/program directory in
the classpath.

Was there a smarter way I should have known the above?

And in terms of expanding on that so that it takes .doc as input (right now
it seems to only accept .odt) and HTML as output (currently not one of the
options listed in the program), are there any gotchas I should know about or
is it just about finding some appropriate API documentation and doing the
fairly obvious things?

This was a great first step, many thanks!
-Kirk




 






Re: [dev] scripted multiplatform .doc to .html conversion

2005-12-13 Thread Mathias Bauer
Kirk Israel wrote:

 Sorry, I'm not being willfully dense here...I understand that if I'm
 doing this through the API, there has to be an instance of OOo
 running, but are you saying that the segment of the source responsible
 for reading in Doc (and the other segment, responsible for spitting
 out HTML) is so tightly coupled with the rest of the system as a whole
 that extracting those two segments isn't feasible, that saying "aha,
 THIS is the conversion function" wouldn't get you anywhere, because it
 depends on so much other stuff working to run?

I think you have a misconception about how document conversion in OOo
works. There is no direct translation between input and output formats:
an input filter always converts the input format into a representation
in memory (the core of a document), and an output filter converts this
into the output format. If you think about this a little, you will see
that anything else wouldn't make sense. In the end, OOo is an
application and not a conversion service: why should there be code that
directly translates from e.g. doc to html? OOo itself doesn't need such
code.
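
The shape of that design can be sketched in a few lines of Java. This is a toy model only, not OOo code: every name in it is invented, and the "core" here is just a list of paragraphs, whereas the real document core is a large part of the application. It does show why no filter ever needs to know about another format: each one only talks to the core.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for the in-memory document core: just a list of paragraphs. */
final class DocumentCore {
    final List<String> paragraphs = new ArrayList<>();
}

/** An input filter only knows how to build a core from one format. */
interface InputFilter {
    DocumentCore read(String source);
}

/** An output filter only knows how to serialize a core to one format. */
interface OutputFilter {
    String write(DocumentCore core);
}

final class PlainTextInput implements InputFilter {
    public DocumentCore read(String source) {
        DocumentCore core = new DocumentCore();
        for (String line : source.split("\n")) {
            if (!line.isBlank()) {
                core.paragraphs.add(line.trim());
            }
        }
        return core;
    }
}

final class HtmlOutput implements OutputFilter {
    public String write(DocumentCore core) {
        StringBuilder html = new StringBuilder("<html><body>");
        for (String p : core.paragraphs) {
            html.append("<p>").append(escape(p)).append("</p>");
        }
        return html.append("</body></html>").toString();
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}

final class Converter {
    /** "Conversion" is nothing more than an import followed by an export. */
    static String convert(String source, InputFilter in, OutputFilter out) {
        return out.write(in.read(source));
    }
}
```

Adding a new format means writing one more InputFilter and/or OutputFilter, never a translator per pair of formats.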

So it will never make sense to isolate the filter code; you always need
the code of the document core as well. Theoretically it is possible to
take the code of the filters and the core and make it a smaller package,
but until now nobody has needed something like this so badly that he
started the work to create such an environment. You would need a kind of
application anyway, and you would need UNO and its bootstrapping, some
of the services in OOo used by the filters, etc.

So it's possible, but it is quite some work, and all you would gain from
it is saving some MB on disk. Is that worth the effort?

BTW: you don't need an *installed* version of OOo on your machine, it's
enough to have a runnable *copy* (though in this case you have to create
each UNO connection manually because your system doesn't provide a hint
where the OOo installation is).
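
As a rough illustration of what "creating the connection manually" involves: the office copy is started so that it listens on a socket, and the client resolves a matching UNO URL (for example via com.sun.star.bridge.XUnoUrlResolver). The helper below only builds the strings and uses no UNO classes. UnoEndpoint is a made-up name, host and port are whatever you choose, the soffice path is hypothetical, and the single-dash options follow OOo 2.x conventions, so check them against your version.

```java
import java.util.List;

/** Illustrative helper: builds the accept string and UNO URL for one listening office copy. */
final class UnoEndpoint {
    private final String host;
    private final int port;

    UnoEndpoint(String host, int port) {
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("bad port: " + port);
        }
        this.host = host;
        this.port = port;
    }

    /** Value for the office's -accept option. */
    String acceptString() {
        return "socket,host=" + host + ",port=" + port + ";urp;";
    }

    /** URL a client hands to the UNO URL resolver to obtain the remote component context. */
    String unoUrl() {
        return "uno:" + acceptString() + "StarOffice.ComponentContext";
    }

    /** Command line for starting the copy; sofficePath points into your copied program directory. */
    List<String> launchCommand(String sofficePath) {
        return List.of(sofficePath, "-headless", "-accept=" + acceptString());
    }
}
```

The list returned by launchCommand can be handed to java.lang.ProcessBuilder to start the copy from your own application.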

Best regards,
Mathias

-- 
Mathias Bauer - OpenOffice.org Application Framework Project Lead
Please reply to the list only, [EMAIL PROTECTED] is a spam sink.





Re: [dev] scripted multiplatform .doc to .html conversion

2005-12-13 Thread Kirk Israel
Mathias, thank you for your feedback...I have a few responses.

 I think you have a misconception about how document conversion in OOo
 works. There is no direct translation between input and output formats:
 an input filter always converts the input format into a representation
 in memory (the core of a document), and an output filter converts this
 into the output format. If you think about this a little, you will see
 that anything else wouldn't make sense. In the end, OOo is an
 application and not a conversion service: why should there be code that
 directly translates from e.g. doc to html? OOo itself doesn't need such
 code.

I assumed that it would be a doc-to-internal unmarshalling followed
by an internal-to-HTML marshalling, for obvious reasons (like needing
2n filters rather than n!)...I guess I was envisioning a small(ish) bit
of code that would do something like (in pseudojava)

Document doc = OOoUtils.getDocument(DOC_CONVERTER, "somefile.doc");
OOoUtils.writeDocument(HTML_CONVERTER, doc, "output.html");

maybe with some Input/Output Streams or services instead, but that's
the general gist.

 So it will never make sense to isolate the filter code; you always need
 the code of the document core as well. Theoretically it is possible to
 take the code of the filters and the core and make it a smaller package,
 but until now nobody has needed something like this so badly that he
 started the work to create such an environment. You would need a kind of
 application anyway, and you would need UNO and its bootstrapping, some
 of the services in OOo used by the filters, etc.

I see what you're getting at: the conversion process isn't
self-contained but depends on a series of services, structures, and
whatnot.

Just by reading some recent archives of this list, I'd say this kind
of scripting is fairly sought after...but maybe the people who want to
cherrypick the functionality aren't the same kind of people willing to
put in the work to make it an isolated tool.

 So it's possible, but it is quite some work, and all you would gain from
 it is saving some MB on disk. Is that worth the effort?

Quite possibly not...I think it was a desire to embed just the
conversion stuff more easily, rather than having OOo be a separate
install. If you could easily embed just a few filters and some
supporting classes at the source code level into a larger project,
that would make it more transparent to the user.

 BTW: you don't need an *installed* version of OOo on your machine, it's
 enough to have a runnable *copy* (though in this case you have to create
 each UNO connection manually because your system doesn't provide a hint
 where the OOo installation is).

Aha, good to know.

 Best regards,
 Mathias

Thank you!
Kirk




Re: [dev] scripted multiplatform .doc to .html conversion

2005-12-09 Thread Jürgen Schmidt

Hi Kirk,

take a look at the SDK example java\DocumentHandling\DocumentConverter.
With it you can easily implement a Java remote client application that
does the conversion for you. But you always need an installed office
working as a server (for example with UI, if necessary).


Juergen

Kirk Israel wrote:

So the folks at my new job decided to really give me a trial by
fire...they'd like me to put together a clear and detailed outline of
how to include .doc to .html conversion in our product, in an automated
kind of way.

OpenOffice seems to handle the basic task gracefully through the UI.
Can anyone tell me if there's a version that would enable this from the
command line? Or, possibly even better, is there a specific callable
module responsible for this? Is there an intermediate in-memory format
that can be marshalled/unmarshalled with the various file formats?

I'm at a bit of a loss to know where to start code diving...would it
be a better idea for a n00b to start using the CVS feed, or is there a
downloadable archive lurking around on one of the websites?

Thanks for any and all advice!  I'm really in dire straits here, so
suggestions are acts of mercy...

-Kirk




Re: [dev] scripted multiplatform .doc to .html conversion

2005-12-09 Thread Kirk Israel
On 12/9/05, Laurent Godard [EMAIL PROTECTED] wrote:

 you may have a look at this, for a very first shot
 http://oooconv.free.fr/oooconv/oooconv_en.html

So that's a webpage in PHP, and a macro for use in an existing instance
of OOo, making a web application for that kind of conversion?




Re: [dev] scripted multiplatform .doc to .html conversion

2005-12-09 Thread Kirk Israel
On 12/9/05, Jürgen Schmidt [EMAIL PROTECTED] wrote:
 Hi Kirk,

 take a look at the SDK example java\DocumentHandling\DocumentConverter.
 With it you can easily implement a Java remote client application that
 does the conversion for you. But you always need an installed office
 working as a server (for example with UI, if necessary).

Hmm. Is your feeling, then, that just the document functionality
might be too difficult to extract on a source code level?




Re: [dev] scripted multiplatform .doc to .html conversion

2005-12-09 Thread Jürgen Schmidt

Kirk Israel wrote:




Hmm. Is your feeling, then, that just the document functionality
might be too difficult to extract on a source code level?


Yes, exactly: the current architecture doesn't allow extracting only
this small part. Maybe it will be possible some time in the future ;-)
To use the API you always need a running office instance. The other
possibility is to work directly on the XML file format with XSL
transformations, but that of course is not possible for most binary
formats.
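
To give an idea of the XSL route, here is a small sketch using the XSLT support that ships with the JDK. It assumes you already have the content.xml stream of an OpenDocument file at hand (an .odt is a zip archive containing it). The class name is invented, and the stylesheet is deliberately tiny: it only turns each text:p paragraph into an HTML paragraph and ignores styles, tables, images and everything else. As said above, this does nothing for binary formats such as .doc.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Illustrative only: applies a minimal stylesheet to OpenDocument content.xml text. */
final class OdtToHtmlSketch {
    // Minimal stylesheet: one HTML <p> per text:p element, nothing else.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
            + " xmlns:text='urn:oasis:names:tc:opendocument:xmlns:text:1.0'"
            + " exclude-result-prefixes='text'>"
            + "<xsl:output method='html' indent='no'/>"
            + "<xsl:template match='/'><html><body>"
            + "<xsl:for-each select='//text:p'><p><xsl:value-of select='.'/></p></xsl:for-each>"
            + "</body></html></xsl:template>"
            + "</xsl:stylesheet>";

    private OdtToHtmlSketch() {}

    /** Transforms the given content.xml text into a bare-bones HTML string. */
    static String transform(String contentXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(contentXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }
}
```

A real stylesheet for this job would be far larger; the sketch is only meant to show that no running office is involved in this approach.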


Juergen






Re: [dev] scripted multiplatform .doc to .html conversion

2005-12-09 Thread Laurent Godard

Hi Kirk,



So that's a webpage in PHP, and a macro for use in an existing instance
of OOo, making a web application for that kind of conversion?



It's a 2 1/2 year old first shot, to give some ideas.
Better can be done, of course (and I will release, when time allows, a
tool like this based on Python, XML-RPC and OOo).


Laurent

--
Laurent Godard [EMAIL PROTECTED] - Ingénierie OpenOffice.org
Indesko  http://www.indesko.com
Nuxeo CPS  http://www.nuxeo.com - http://www.cps-project.org
Livre Programmation OpenOffice.org, Eyrolles 2004




Re: [dev] scripted multiplatform .doc to .html conversion

2005-12-09 Thread Kirk Israel
On 12/9/05, Jürgen Schmidt [EMAIL PROTECTED] wrote:
 Kirk Israel wrote:
  On 12/9/05, Jürgen Schmidt [EMAIL PROTECTED] wrote:
 
  Hmm. Is your feeling, then, that just the document functionality
  might be too difficult to extract on a source code level?

 Yes, exactly: the current architecture doesn't allow extracting only
 this small part. Maybe it will be possible some time in the future ;-)
 To use the API you always need a running office instance. The other
 possibility is to work directly on the XML file format with XSL
 transformations, but that of course is not possible for most binary
 formats.

Sorry, I'm not being willfully dense here...I understand that if I'm
doing this through the API, there has to be an instance of OOo
running, but are you saying that the segment of the source responsible
for reading in Doc (and the other segment, responsible for spitting
out HTML) is so tightly coupled with the rest of the system as a whole
that extracting those two segments isn't feasible, that saying "aha,
THIS is the conversion function" wouldn't get you anywhere, because it
depends on so much other stuff working to run?

Dang, if that IS the case my manager isn't going to like that I'm
shooting down the team's preferred cool new idea :-)

Thanks,
Kirk




Re: [dev] scripted multiplatform .doc to .html conversion

2005-12-08 Thread Laurent Godard

Hi

OpenOffice seems to handle the basic task gracefully through the UI.
Can anyone tell me if there's a version that would enable this from the
command line? Or, possibly even better, is there a specific callable
module responsible for this? Is there an intermediate in-memory format
that can be marshalled/unmarshalled with the various file formats?



you may have a look at this, for a very first shot
http://oooconv.free.fr/oooconv/oooconv_en.html

Laurent

--
Laurent Godard [EMAIL PROTECTED] - Ingénierie OpenOffice.org
Indesko  http://www.indesko.com
Nuxeo CPS  http://www.nuxeo.com - http://www.cps-project.org
Livre Programmation OpenOffice.org, Eyrolles 2004
