Re: out of memory error when creating pptx file

2020-02-04 Thread Rob Sargent



> On Feb 4, 2020, at 1:32 AM, jaehoon jeong wrote:
> 
> Thank you for your reply.
> 
> I can use about 4 GB of memory, but I have to handle multiple requests.

Memory is remarkably inexpensive compared to your time and your sanity.

> For example, if three requests come in at the same time, an out of memory 
> error occurs.
> 
> One request takes about 30 seconds.

Are you automatically generating 600 slides per online request? Is the slideshow 
a report?

> Saving each batch of 100 slides and then merging them also uses a lot of memory.
> 
> Can xerces merge individual pptx files without consuming too much memory?

Xerces is an XML tool. Perhaps you can stream edit a template to generate the 
final show.
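
A minimal sketch of what "stream editing a template" could look like for a single slide, assuming the template slide XML contains placeholders of the form ${name}. The placeholder syntax, the class name SlideTemplateFiller and its methods are hypothetical and not part of POI or Xerces. Only one slide's text is held at a time, instead of an XmlBeans object tree for every slide.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: fills ${name} placeholders in one slide's XML text.
final class SlideTemplateFiller {
    // Assumed placeholder syntax: ${name}, where name is letters, digits or underscore.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    /** Escapes the characters that are special in XML text content. */
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Writes templateXml to out, replacing each ${name} with the escaped value
     * from values. Placeholders without a value are written unchanged.
     */
    static void fill(String templateXml, Map<String, String> values, Writer out)
            throws IOException {
        Matcher m = PLACEHOLDER.matcher(templateXml);
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous placeholder and this one.
            out.write(templateXml, last, m.start() - last);
            String value = values.get(m.group(1));
            out.write(value == null ? m.group() : escapeXml(value));
            last = m.end();
        }
        // Copy whatever follows the last placeholder.
        out.write(templateXml, last, templateXml.length() - last);
    }
}
```

The Writer can wrap a zip entry of the target pptx, so each filled slide goes straight to disk. One caveat: text typed in PowerPoint is sometimes split across several runs, so a placeholder may need to be checked (or hand-edited) in the template XML before this kind of substitution finds it.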
>> On 2020/02/04 02:30:43, Rob Sargent wrote:
>> 
>> 
>>> On 2/3/20 7:20 PM, jaehoon jeong wrote:
>>> Thank you for your reply.
>>> 
>>> I am using Oracle JDK 1.8, but I do not want to increase heap memory.
>>> I want to find a solution in another way.
>> How little memory must this fit in?
>>> 
>>> What do you mean by "chapters"?
>>> For example, does it mean dividing the slides into groups of 100 and saving each group as a file?
>>> 
>>> I do not want to save as multiple files.
>>> I want to save all slides in one file.
>>> 
>>> For example, I'm considering flushing slides to a file every 100 slides 
>>> created.
>>> Is this possible using the poi library?
>> How far into the 600 do you get?
>> I'm not sure, but saving each hundred slides would mean writing a new 
>> slide show.  You would then have to merge those into a single slide show via 
>> poi, unless you are prepared to merge the underlying xml files. That 
>> approach would certainly be more memory efficient, but it is more a xerces 
>> problem than a poi problem.
>>> 
>>> On 2020/02/03 14:56:13, Rob Sargent wrote:
>>>> Since no one really wants to sit through 600 slides, break the total up into 
>>>> “chapters” and see where that gets you?
>>>> 
>>>> How much memory does your machine have, and which version of Java are you using?
>>>> 
>>>>> On Feb 3, 2020, at 7:15 AM, jaehoon jeong wrote:
>>>>> 
>>>>> Hello
>>>>> 
>>>>> I'm trying to generate a pptx file using the poi library.
>>>>> The XMLSlideShow object contains about 600 XSLFSlides.
>>>>> Each XSLFSlide object uses about 3 MB of memory, so about 1.8 GB of memory is 
>>>>> required to create one pptx file.
>>>>> 
>>>>> It uses too much memory, causing an out of memory error.
>>>>> And I can't increase the heap memory size.
>>>>> 
>>>>> Is there a way to write the slides to one pptx file in several passes?
>>>>> Is there a way to reduce memory usage?
>>>>> 
>>>>> Thanks in advance.
>>>>> 
>>>>> -
>>>>> To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
>>>>> For additional commands, e-mail: user-h...@poi.apache.org
>>>>> 
 


Re: out of memory error when creating pptx file

2020-02-04 Thread jaehoon jeong
Thank you for your reply.

The 600 slides have a similar structure.

Creating an xml file with template information and an xml file with data seems 
difficult.

But suppose I created each xml file.
How do I map the data?

I would have to parse the repetitive data xml and map it to slide objects, which 
in the end would again use a lot of memory.

Is there a way to make two xml files into one pptx file without using a lot of 
memory?

On 2020/02/03 19:21:51, Andreas Beeker wrote:
> Hi,
> 
> XSLF and XmlBeans are undoubtedly memory hogs, but I can't solve this in a 
> short period -
> I often thought about an internal model which we could use for XSLF / HSLF, 
> but let's get back to your issue.
> 
> So when you say you need to generate 600 slides ... does that mean they have 
> a similar structure?
> i.e. could you use a template mechanism?
> 
> If true, I would generate one set of slides with POI and then copy those and 
> fill the copies with the repetitive data outside of POI.
> Handling the two XML files per slide (slide + .rels) and adding them to the 
> zip file is no magic.
> Apart from the slides, presentation.xml and its .rels also need to be 
> modified.
> 
> Andi
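
The zip-level step Andreas describes can be sketched with java.util.zip alone. This is only an illustration under stated assumptions, not POI code: the class PptxPartAppender, the PartWriter callback and the part names in the usage note are hypothetical. It copies a template pptx entry by entry, replaces the parts that have to change (for example ppt/presentation.xml and ppt/_rels/presentation.xml.rels) and appends new slide parts, producing each part only when it is written so that no more than one part is in memory at a time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: rewrites a pptx (zip) package without loading it into POI.
final class PptxPartAppender {
    /** Hypothetical callback that produces one part's bytes on demand. */
    interface PartWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /**
     * Streams every entry of template to target. Entries named in parts are
     * replaced by the callback's output; names not present in the template
     * are appended at the end. Both streams are closed when done.
     */
    static void rewrite(InputStream template, OutputStream target,
                        Map<String, PartWriter> parts) throws IOException {
        Map<String, PartWriter> pending = new LinkedHashMap<>(parts);
        byte[] buf = new byte[8192];
        try (ZipInputStream in = new ZipInputStream(template);
             ZipOutputStream out = new ZipOutputStream(target)) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                out.putNextEntry(new ZipEntry(entry.getName()));
                PartWriter replacement = pending.remove(entry.getName());
                if (replacement != null) {
                    replacement.writeTo(out);      // modified part
                } else {
                    int n;                         // unchanged part: plain copy
                    while ((n = in.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                }
                out.closeEntry();
            }
            // Whatever is left was not in the template: new slides and their .rels.
            for (Map.Entry<String, PartWriter> e : pending.entrySet()) {
                out.putNextEntry(new ZipEntry(e.getKey()));
                e.getValue().writeTo(out);
                out.closeEntry();
            }
        }
    }
}
```

A caller would register, per generated slide, something like ppt/slides/slide2.xml and ppt/slides/_rels/slide2.xml.rels, plus replacements for presentation.xml and its .rels. As far as I know, [Content_Types].xml also needs an entry for each added slide part, so it would be replaced in the same pass.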
> 
> 
> 
> On 03.02.20 15:15, jaehoon jeong wrote:
> > Hello
> >
> > I'm trying to generate a pptx file using the poi library.
> > The XMLSlideShow object contains about 600 XSLFSlides.
> > Each XSLFSlide object uses about 3 MB of memory, so about 1.8 GB of memory is 
> > required to create one pptx file.
> >
> > It uses too much memory, causing an out of memory error.
> > And I can't increase the heap memory size.
> >
> > Is there a way to write the slides to one pptx file in several passes?
> > Is there a way to reduce memory usage?
> >
> > Thanks in advance.



Re: out of memory error when creating pptx file

2020-02-04 Thread jaehoon jeong
Thank you for your reply.

I can use about 4 GB of memory, but I have to handle multiple requests.
For example, if three requests come in at the same time, an out of memory error 
occurs.

One request takes about 30 seconds.

Saving each batch of 100 slides and then merging them also uses a lot of memory.

Can xerces merge individual pptx files without consuming too much memory?
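
On the concurrency point: with the figures given in this thread (about 1.8 GB per file, about 4 GB available), three simultaneous requests need roughly 3 x 1.8 = 5.4 GB, while two need 3.6 GB and fit. Independently of how the file is built, one option is to bound the number of generations that run at once. A minimal sketch with a JDK Semaphore follows; the class GenerationLimiter is hypothetical, and the budget arithmetic assumes the whole 4 GB is usable for slide generation, which in practice it would not be.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical helper: lets only as many generations run as the memory budget allows.
final class GenerationLimiter {
    private final int maxConcurrent;
    private final Semaphore permits;

    /** Both arguments use the same unit (for example MB). */
    GenerationLimiter(long memoryBudget, long memoryPerRequest) {
        if (memoryPerRequest <= 0) {
            throw new IllegalArgumentException("memoryPerRequest must be positive");
        }
        // At least one request is always allowed, even if it exceeds the budget.
        this.maxConcurrent = (int) Math.max(1L, memoryBudget / memoryPerRequest);
        this.permits = new Semaphore(maxConcurrent, true); // fair: first come, first served
    }

    int maxConcurrent() {
        return maxConcurrent;
    }

    /** Blocks until a permit is free, runs the task, then releases the permit. */
    <T> T run(Callable<T> task) throws Exception {
        permits.acquire();
        try {
            return task.call();
        } finally {
            permits.release();
        }
    }
}
```

With new GenerationLimiter(4096, 1843) (values in MB) two generations run in parallel and a third caller waits. At about 30 seconds per request that wait is noticeable, so this trades latency for not running out of memory.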

On 2020/02/04 02:30:43, Rob Sargent wrote:
> 
> 
> On 2/3/20 7:20 PM, jaehoon jeong wrote:
> > Thank you for your reply.
> >
> > I am using Oracle JDK 1.8, but I do not want to increase heap memory.
> > I want to find a solution in another way.
> How little memory must this fit in?
> >
> > What do you mean by "chapters"?
> > For example, does it mean dividing the slides into groups of 100 and saving each group as a file?
> >
> > I do not want to save as multiple files.
> > I want to save all slides in one file.
> >
> > For example, I'm considering flushing slides to a file every 100 slides 
> > created.
> > Is this possible using the poi library?
> How far into the 600 do you get?
> I'm not sure, but saving each hundred slides would mean writing a new 
> slide show.  You would then have to merge those into a single slide show via 
> poi, unless you are prepared to merge the underlying xml files. That 
> approach would certainly be more memory efficient, but it is more a xerces 
> problem than a poi problem.
> >
> > On 2020/02/03 14:56:13, Rob Sargent wrote:
> >> Since no one really wants to sit through 600 slides, break the total up into 
> >> “chapters” and see where that gets you?
> >>
> >> How much memory does your machine have, and which version of Java are you using?
> >>
> >>> On Feb 3, 2020, at 7:15 AM, jaehoon jeong wrote:
> >>>
> >>> Hello
> >>>
> >>> I'm trying to generate a pptx file using the poi library.
> >>> The XMLSlideShow object contains about 600 XSLFSlides.
> >>> Each XSLFSlide object uses about 3 MB of memory, so about 1.8 GB of memory is 
> >>> required to create one pptx file.
> >>>
> >>> It uses too much memory, causing an out of memory error.
> >>> And I can't increase the heap memory size.
> >>>
> >>> Is there a way to write the slides to one pptx file in several passes?
> >>> Is there a way to reduce memory usage?
> >>>
> >>> Thanks in advance.



Re: Debugging tip for "Excel cannot open the file" ... when opening the file created by POI OOXML

2020-02-04 Thread Dominik Stadler
Hi,

Small addition: for comparing two .xlsx files we have a tool in the
dev-sources called OOXMLPrettyPrint, which reformats all the XML
inside the .xlsx so that files created by POI and ones written by Excel look as
similar as possible for a text/file comparison between them. This can
make finding differences much easier.

Dominik.
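
For readers without the POI dev sources at hand, the idea can be approximated with the JDK alone. The sketch below is not the OOXMLPrettyPrint tool (its implementation is not shown in this thread); the class ZipXmlPrettyPrinter is a hypothetical stand-in that copies a zip package and re-serializes every .xml and .rels entry with indentation, so that two packages can be unzipped and compared with an ordinary text diff.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.XMLConstants;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical stand-in, not the POI tool: indents the XML parts of a zip package.
final class ZipXmlPrettyPrinter {
    static void prettyPrint(InputStream source, OutputStream target)
            throws IOException, TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Transformer identity = factory.newTransformer(); // identity transform
        identity.setOutputProperty(OutputKeys.INDENT, "yes");

        try (ZipInputStream in = new ZipInputStream(source);
             ZipOutputStream out = new ZipOutputStream(target)) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                String name = entry.getName();
                // Buffer the entry first so the XML parser cannot close the zip stream.
                byte[] data = readAll(in);
                if (name.endsWith(".xml") || name.endsWith(".rels")) {
                    ByteArrayOutputStream pretty = new ByteArrayOutputStream();
                    identity.transform(new StreamSource(new ByteArrayInputStream(data)),
                                       new StreamResult(pretty));
                    data = pretty.toByteArray();
                }
                out.putNextEntry(new ZipEntry(name));
                out.write(data);
                out.closeEntry();
            }
        }
    }

    /** Reads the current zip entry (or any stream) to its end. */
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            bos.write(buf, 0, n);
        }
        return bos.toByteArray();
    }
}
```

Running it over both the POI-generated file and a file saved by Excel, then diffing the unzipped results, gives a comparison along the lines Dominik describes. The exact indentation depends on the JDK's transformer implementation.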

On Tue, Feb 4, 2020 at 12:46 AM Kuro Kurosaka wrote:

> I thought I had created a test case by calling the APIs exactly as the app
> server does, but it didn't quite work.
> The generated .xlsx file opens successfully even when I use POI 3.17.
> I will create a new issue if I find a way to reproduce this consistently.
>
> I have been calling XSSFWorkbook.close() with POI 3.17. With POI 3.10.1,
> XSSFWorkbook doesn't have a close() method.
>
>
>
> On Mon, Feb 3, 2020 at 3:09 PM Jörn Franke wrote:
>
> > Can you share the code of the unit test?
> >
> > Maybe the file is not properly closed on the application server or there
> > is an unlogged exception.
> >
> > > On 03.02.2020 at 23:41, Kuro Kurosaka <k...@spartansoftwareinc.com> wrote:
> > >
> > > I've read this issue
> > > https://bz.apache.org/bugzilla/show_bug.cgi?id=59738
> > > which suggests version 3.10.1 works. And I tried this version and the
> > > problem is gone!
> > >
> > > Should I re-open this issue, if I can find a way to reproduce it?
> > >
> > >> On Mon, Feb 3, 2020 at 1:39 PM Kuro Kurosaka <k...@spartansoftwareinc.com>
> > >> wrote:
> > >>
> > >> OLE2 is a Windows file format, isn't it? I'm on Mac/Linux, and even if the
> > >> validator exists and runs for .xlsx, I can't run it. (I'm assuming it's an
> > >> .exe file.)
> > >> The link to the article is also broken.
> > >>
> > >> I'll try to find out whether there's any code close to a validator in the
> > >> POI source.
> > >>
> > >> Thank you for mentioning .xsb files. They exist in the shaded jar but they
> > >> weren't relocated. And if I did relocate them, I'm guessing there would be
> > >> lots of file-not-found exceptions. I temporarily stopped relocation, but
> > >> that didn't improve the situation.
> > >>
> > >> On Mon, Feb 3, 2020 at 12:55 PM Andreas Beeker wrote:
> > >>
> > >>> We have two entries in the FAQ [1] about file validation, which I haven't
> > >>> used myself yet ... and which are probably futile in your case.
> > >>> You can try to validate against the ECMA 376 schemas.
> > >>>
> > >>> If I have similar problems I try to go step-wise from the simple case to
> > >>> the complex ...
> > >>> and yes, it's sometimes quite time consuming.
> > >>>
> > >>> Can you try your shaded jar in the unit test? ... my guess is, it might
> > >>> not include all XmlBeans files (*.xsb)
> > >>>
> > >>> [1] https://poi.apache.org/help/faq.html
> > >>>
> > >>> On 03.02.20 21:25, Kuro Kurosaka wrote:
> > >>>> The .xlsx file from the test run has the same file structure as the .xlsx
> > >>>> from the real run that doesn't open.
> > >>>> The JAR I upload to an application server is shaded and includes the POI
> > >>>> library, which is relocated to version-specific packages to avoid
> > >>>> collisions. So it's the same version of POI as in the test run.
> > >>>>
> > >>>> If there is no better way, I could somehow record all POI calls in the real
> > >>>> run and put them into the unit test,
> > >>>> but I'd rather avoid this route as it is very time consuming. I am
> > >>>> hoping there's a way for
> > >>>> Excel to tell me what errors it is seeing.
> > >>>
> > >>>
> > >>>
> > >>
> > >> --
> > >> T. Kuro Kurosaka, Software Engineer, Spartan Software Inc.
> > >>
> > >>
> > >
> > > --
> > > T. Kuro Kurosaka, Software Engineer, Spartan Software Inc.
> >
>
> --
> T. Kuro Kurosaka, Software Engineer, Spartan Software Inc.
>