Re: Apache POI functionality breaks in JAVA 17

2023-01-05 Thread Nick Burch

On Thu, 5 Jan 2023, Dhaval Kaushik wrote:

Also, this error only occurs when using Tomcat with Java 17. The
apache-poi version is 3.14. Upgrading to a higher version breaks some
existing code, which is not correct. So is apache-poi not compatible with
Java 17 in Tomcat?


Apache POI 3.14 was released in 2016 - 
https://poi.apache.org/devel/history/changes-3x.html#3.14


The fact that a version that is almost 7 years old, and predates Java 17 
by almost as long, doesn't work perfectly isn't a huge surprise. You need 
to use a more modern release


The latest Apache POI release is 5.2.3, you should try that. There are a 
lot of bug fixes in the last 7 years, see https://poi.apache.org/changes.html 
for a summary of the recent ones


Nick

-
To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
For additional commands, e-mail: user-h...@poi.apache.org



Re: font problem or wrong usage

2022-09-09 Thread Nick Burch

On Fri, 9 Sep 2022, Peter Busfy wrote:
This class (StyleCreator), where I hold all styles, is created per 
workbook. All styles in this StyleCreator class are initialized only 
once.


Hmm, that's the most obvious cause excluded

For a workbook where one sheet is happy and one sheet isn't:
* Take a copy of the file
* Open the copy in Excel, fix the first few rows manually, and save
* Unzip both the original and copy
* In the POI generated file, compare the sheet XML for the "works" sheet
  with the "broken" sheet, especially around style attributes
* Then compare the "broken" sheet XML with the "Excel fixed", for the same
  bits

Hopefully one of those will give a hint about what's going wrong

Nick




Re: font problem or wrong usage

2022-09-09 Thread Nick Burch

On Fri, 9 Sep 2022, Peter Busfy wrote:
I started using POI recently, and I ran into an interesting issue with 
fonts. I have one dedicated class where I store all the styles I use in 
my project. In this class I also hold the font. So far I need only one 
font in the whole project, so I initialized this one font only once and 
stored it as a class variable. I then used this font when initializing 
all the styles.


Are you caching the styles too? Styles are per-workbook.

Surprisingly, it started causing quite unpredictable behavior. In some 
sheets it works without any problems, but in others issues occurred.


If you're creating a fresh style every time, you're probably just running 
out of styles and Excel is ignoring all the ones past its max.


Re-use styles across cells and sheets and you won't run out, then Excel 
will pay attention to all of them
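
A minimal sketch of that re-use, written against the JDK only so it runs without POI on the classpath. The class name StyleCache and the string keys are made up for illustration and are not part of POI. In real code the type parameter S would be CellStyle, the factory would wrap workbook.createCellStyle(), and you would keep one cache per workbook, since styles are per-workbook.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not part of POI: creates each named style once and
// hands back the same instance on every later request.
public class StyleCache<S> {
    private final Map<String, S> styles = new HashMap<>();
    private int created = 0;

    // key is any name you choose, e.g. "header" or "money";
    // factory is only invoked the first time that key is seen.
    public S get(String key, Supplier<S> factory) {
        S style = styles.get(key);
        if (style == null) {
            style = factory.get();
            styles.put(key, style);
            created++;
        }
        return style;
    }

    // Number of styles actually created, however many cells asked for one.
    public int createdCount() {
        return created;
    }
}
```

With this shape, styling a thousand cells as "header" still creates a single style, which keeps the workbook well under Excel's style limit.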


Nick




Re: Thread Safety of sheet operations

2022-05-19 Thread Nick Burch

On Mon, 16 May 2022, Andreas Reichel wrote:

Of course I am fully aware that POI is not thread-safe at the workbook
level and threads can only be used as long as every sheet is processed
within its own thread (without altering styles).


We only ensure that you are safe if each workbook is created in its own 
thread, we generally advise against having multiple threads working on the 
same workbook as it often goes wrong. It sounds like you've pushed things 
well beyond our normally recommended limits already...



So far so good.

I only wonder, if POI should have blocked Sheet sheet =
workbook.createSheet(p[4]); by itself, e. g. using a semaphore and
print a warning.


Won't that slow down the 99.9% case, where people are just using a single 
thread for the whole workbook? And doesn't it risk people trying to do 
things that aren't generally thread-safe because "that bit seemed fine" 
and getting stuck later?


Nick


Re: org.apache.poi.xwpf.usermodel.XWPFDocument Set start/end Page to extract text

2021-10-23 Thread Nick Burch

On Sat, 23 Oct 2021, nskarthik wrote:

Process : POI 4.1.2 ,jdk15 ,win10


You should consider upgrading to Apache POI 5.0 - there have been quite a 
few fixes since then, see http://poi.apache.org/changes.html#5.0.0



Question : Extract Text only from MS-Word docx/doc from specific
pages  ( Start Page / End Page ) defined.


Not possible. The pages aren't stored in the Word file formats. Your only 
way to know what is on each page is to render it, which we don't support


Nick




Re: Reading Massive Excel Files to csv

2021-05-03 Thread Nick Burch

On Mon, 3 May 2021, Oscar Bastidas wrote:

Thanks Nick for the updated link.


I've updated the link in the sources for the site, hopefully one of the 
other devs who has all of the tools installed can republish shortly



With your comment on moving to Gradle, does this mean there will be a time
when Apache POI will not be available on Maven?


Nope! Will always be available on Maven central, for users of Maven, 
Gradle or Ivy.


This only affects people trying to build Apache POI from source, and 
hopefully makes their lives easier than before when we used Ant.


Nick




Re: Reading Massive Excel Files to csv

2021-05-03 Thread Nick Burch

On Mon, 3 May 2021, Oscar Bastidas wrote:
I am trying to read a large Excel spreadsheet (60,000 rows) but I get 
what appears to be a memory leak error from the JVM when I use the 
*XSSFWorkbook *API.  I learned recently that there are size limitations 
on Excel files being read in this way and the Apache POI website 
specifically recommends reading the file in a streaming fashion instead 
of taking the whole file in memory.  To do this, POI recommends using 
something called *XLSX2CSV* but the provided link to teach how to use 
this returns a "page not found error."


That file has moved to
https://svn.apache.org/repos/asf/poi/trunk/poi-examples/src/main/java/org/apache/poi/examples/xssf/eventusermodel/XLSX2CSV.java

We have re-organised the svn area as part of our move to Gradle as our 
build system, and not all the links have been updated yet


Nick




Re: Invalid CellReference:1A

2021-02-10 Thread Nick Burch

On Wed, 10 Feb 2021, Ash B wrote:

While reading the file in Java, I get the issue below.

java.lang.IllegalArgumentException: Invalid CellReference:1A
at org.apache.poi.ss.util.CellReference.seperateRefParts


A cell reference of 1A is invalid - it should be A1.

Where is this problematic file coming from?

If you rename the .xlsx file to .zip, unzip and check the sheet xml files, 
can you see the 1A reference there? If so, can you please share the 
xml snippet?
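
As a rough aid for scanning the unzipped sheet XML for such references, here is a JDK-only sketch. The class and method names are invented for illustration, and the pattern is a simplification (column letters followed by a row number, with optional $ markers). It is not POI's own CellReference parsing.

```java
import java.util.regex.Pattern;

public class A1RefCheck {
    // Simplified A1-style pattern: 1-3 column letters, then a row number that
    // does not start with zero. Optional '$' marks absolute references.
    private static final Pattern A1 =
            Pattern.compile("\\$?[A-Za-z]{1,3}\\$?[1-9][0-9]*");

    public static boolean looksLikeA1(String ref) {
        return ref != null && A1.matcher(ref).matches();
    }

    public static void main(String[] args) {
        System.out.println("A1 -> " + looksLikeA1("A1"));
        System.out.println("1A -> " + looksLikeA1("1A"));
    }
}
```

Running the r="..." attribute values from the sheet XML through a check like this should point at the cells written with row and column swapped.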


Nick




Re: Upgrade to 5.0.0 resulted in IOException message

2021-02-07 Thread Nick Burch

On Fri, 5 Feb 2021, Bryan Coleman wrote:

That makes sense.  I am using an ant script to unjar individual jars and
then to create a mega jar from my classes and those APIs.  I am keeping the
services files.  That said, I noticed that both the main poi and the
poi-ooxml jar have services files with the same name; however, different
content.  What is the proper way to include those?  Should I create one
file that contains the content from both?


When merging jars with service files, you need to append the files with 
the same name together - they contain one provider class name per line, so 
you need all the lines from every copy
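
A minimal JDK-only sketch of that append step, in case it helps. The class name ServiceFileMerge is made up, and how you read the two files out of the jars and write the result back into the mega jar is left to your build. The de-duplication and comment skipping are my own additions, not something the ServiceLoader format requires.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ServiceFileMerge {
    // Takes the text of each same-named META-INF/services file and returns one
    // merged file body: every provider line kept once, in first-seen order.
    public static String merge(List<String> fileContents) {
        Set<String> providers = new LinkedHashSet<>();
        for (String content : fileContents) {
            // \R matches any line break (\n, \r\n, ...)
            for (String line : content.split("\\R")) {
                String trimmed = line.trim();
                // Skip blank lines and '#' comment lines
                if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                    providers.add(trimmed);
                }
            }
        }
        return String.join("\n", providers) + "\n";
    }
}
```

Feeding it the contents of the services file from the main poi jar and the one from the poi-ooxml jar should give a single file that lists the providers from both.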


Nick




Re: Upgrade to 5.0.0 resulted in IOException message

2021-02-05 Thread Nick Burch

On Wed, 3 Feb 2021, Bryan Coleman wrote:
java.io.IOException: Your InputStream was neither an OLE2 stream, nor an 
OOXML stream or you haven't provide the poi-ooxml*.jar in the 
classpath/modulepath - FileMagic: OLE2, having providers: 
[org.apache.poi.xssf.usermodel.XSSFWorkbookFactory@40086342]


That's very odd - you have the main POI jar there (need it for 
WorkbookFactory), but somehow have lost the reference to the HSSF classes 
which also live there...


Note: It works when running in my IDE; however, when it runs from a 
(mega) jar (built by ant), we get the exception above.


Ah, I suspect something is clobbering some files when it merges stuff into 
a mega jar. My guess is it's the /META-INF/services files, and your mega 
jar is taking just one rather than merging as it needs to


Can you post details of how you're building the mega jar, and/or look up 
how to get the tool/task you're using to merge not overwrite ServiceLoader 
/ Java Service Provider Interface services files?


Nick




Re: Floating Point Arithmetics, reading 0.1066913 gives 0.10669129999999999 --> GNUMERIC vs LibreOffice vs Excel

2019-10-20 Thread Nick Burch

On Sun, 20 Oct 2019, David Law wrote:
the Cells have no Format.  Take a look at the attached File, cell F10. 
Andreas tells me it was entered as 0.1066913 & that's how it's displayed 
too, although it has no format.


Numeric cells have a default format if nothing else is applied, it could 
be that perhaps?


I know that David North did some work a few years ago on trying to 
understand + match the Excel floating point rules, it might be worth 
having a look at some of his mailing list posts for more details. He isn't 
involved much in POI at the moment (day job priority changes), but we can 
always ping him to chime in if needed!


Nick


Re: Floating Point Arithmetics, reading 0.1066913 gives 0.10669129999999999 --> GNUMERIC vs LibreOffice vs Excel

2019-10-20 Thread Nick Burch

On Sun, 20 Oct 2019, Andreas Reichel wrote:

Opening the file in an XML Text Editor, I get 0.106691299.
Opening the file in GNUMERIC, I get 0.106691299. (Both the
shown cell content as well as the editable text box).
Opening the file in LibreOffice, I get 0.1066913. (Same file, I have
tried 3 times. Both the shown cell content as well as the editable text
box).
Opening the file in MS Excel, I get 0.1066913.

So, while it is still an Excel problem, I wonder how/why LibreOffice
and Excel know/decide about interpreting the value as  0.1066913?!


If you ask POI to format the value to a string based on the formatting 
rules applied to the cell, using DataFormatter or similar, do you get the 
value you expect?
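
One possible explanation, offered as an assumption rather than something confirmed in this thread: Excel is documented as keeping 15 significant digits, so a stored double such as 0.10669129999999999 would show as 0.1066913 once rounded. The JDK-only sketch below (the class name is made up) shows that rounding. It is not how DataFormatter is implemented.

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class FifteenDigits {
    // Rounds the exact value of the double to 15 significant digits
    // (HALF_UP, the MathContext default) and drops trailing zeros.
    public static String display(double value) {
        return new BigDecimal(value)
                .round(new MathContext(15))
                .stripTrailingZeros()
                .toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(display(0.10669129999999999));
    }
}
```

If the formatted string from POI matches what this prints, the difference between the applications is most likely about display precision rather than the stored value.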


Nick




Re: Hyperlink() special case

2018-11-29 Thread Nick Burch

On Thu, 29 Nov 2018, Greg Woolsey wrote:

The first argument to the function is the target URL, the second argument
is optional display text.

The current implementation returns the display text, if present, as the
function value.  This is correct, as that's the value displayed, but if you
want to actually implement a link, one needs the URL argument also.


I'd lean towards one of:
 * ThreadLocal that the hyperlink function checks (with helpful setter and
   clearer on it) to toggle between returning the text or link
 * setter on FormulaEvaluator that lets you toggle, then pass that
   somehow to the hyperlink function
 * subclass of FormulaEvaluator with extra evaluateCell method that
   returns the link, plus similar to above way to pass flag to function

Nick




Re: CLI for POIFS

2018-11-21 Thread Nick Burch

On Wed, 21 Nov 2018, Jarl Friis wrote:

As explained on http://poi.apache.org/components/poifs/index.html
POIFS is a library to handle POIFS files just like a zip-container.

Is there a command line tool to just unpack any OLE2, i.e. POIFS file?


org.apache.poi.poifs.dev.POIFSDump may do what you want

Nick




Re: Disabling Logging in POI 3.15

2018-11-19 Thread Nick Burch

On Mon, 19 Nov 2018, Sawan.Patwari wrote:

There were around 91 matches to the 'System.out.println' statements in the
POI-3.15 source-code


Apache POI 3.15 is now over 2 years old, and several security issues have 
been fixed in newer versions. What happens when you upgrade?


Nick




Re: POI 4.0.0 issues with new commons-compress library "InputStream of class [..] is not implementing InputStreamStatistics"

2018-09-29 Thread Nick Burch

On Sat, 29 Sep 2018, Jörn Franke wrote:
as part of the HadoopOffice library ( 
https://github.com/zuinnote/hadoopoffice/wiki) we provide the 
functionality to read office documents, such as MS Excel, on Big Data 
platforms, such as Hadoop/Hive/Spark/Flink.


We should probably list that on the website! Do you have a blurb of a few 
paragraphs we can use?



I want to release a new version supporting POI 4.0.0, but I have one
remaining blocking issue: The Big Data platforms use an old version of
commons-compress (between 1.4.x and 1.9.x). This means I am always running
into the exception in ZipArchiveThresholdInputStream "InputStream of class
[..] is not implementing InputStreamStatistics" (
https://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/openxml4j/util/ZipArchiveThresholdInputStream.java?view=markup=1832789
).


We need that for security reasons - newer Java versions won't let us 
protect against zip bomb attacks as they inconveniently hide the expansion 
stats, so we had to switch to commons to guard against it.



Unfortunately, updating these platforms to the latest commons-compress is
very intrusive and for many organizations not possible.


Wave some CVEs at them and see if you can tempt an upgrade?

If not, you'd probably need to work with the commons folks to backport the 
zip stats stuff to your old version, so you can keep the security stuff we 
need? dev@commons is moderately quiet and fairly friendly :)


Nick


Re: Create XSSFWorkbook

2018-09-07 Thread Nick Burch

On Fri, 7 Sep 2018, Tony Obermeit wrote:

Thanks.  Much progress, now it's failing on:
org.apache.xmlbeans.XmlException. I think that's part of
ooxml-schemas-1.3.jar, trying to find a downloadable jar.


Nope, that's in xmlbeans itself. If you download the binary release of 
Apache POI, it's in the ooxml lib directory. I'd recommend using the 
latest xmlbeans jar (3.0.1) from https://xmlbeans.apache.org/ though, as 
there are a few key fixes since 2.6!


Nick




Re: Create XSSFWorkbook

2018-09-07 Thread Nick Burch

On Fri, 7 Sep 2018, t...@tamborine.to wrote:

This document and others give a snippet: new XSSFWorkbook()

or
workbook = new XSSFWorkbook()

This fails: unable to resolve class XSSFWorkbook


See http://poi.apache.org/components/index.html#components - XSSF (and the 
other OOXML classes) require additional jars


Nick




Re: Workbook saved using POI 3.16 needs to be repaired when opening in Excel 2016

2018-06-23 Thread Nick Burch

On Fri, 22 Jun 2018, Quinton McCombs wrote:
The nightly build does not have this problem although 3.17 does. Is 
there a workaround until 4.0 is released?


Keep using the nightly build?

(I'd actually recommend a nightly build from about a week ago, there's a 
bunch of changes happening right now with xmlbeans in preparation for 4.0 
so something just before that may be more stable, though anything that 
passes all the unit tests *ought* to be just fine!)


Nick




Re: Workbook saved using POI 3.16 needs to be repaired when opening in Excel 2016

2018-06-22 Thread Nick Burch

On Fri, 22 Jun 2018, Quinton McCombs wrote:
I have run into a problem after upgrading from POI 3.14 to POI 3.16. 
With POI 3.14, I can load a workbook, save it, and then open in Excel 
2016 without a problem. However, after upgrading to POI 3.16, the 
workbook needs to be repaired after opening in Excel. The repaired 
version of the workbook loses some formatting and all named ranges are 
gone.


Can you try a recent nightly build, and see if we've already fixed it 
since 3.16?


Nick




Re: Application fails with Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/poi/xwpf/usermodel/XWPFDocument

2018-02-11 Thread Nick Burch

On Sat, 10 Feb 2018, Marco Lechner - FOSSGIS e.V. wrote:

Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/poi/xwpf/usermodel/XWPFDocument
    at de.bfs.dokpool.faq.importer.FaqImporter.main(FaqImporter.java:23)
Caused by: java.lang.ClassNotFoundException:
org.apache.poi.xwpf.usermodel.XWPFDocument

As far as I know, the relevant dependency packages seem to be included in
the jar file. I checked with unzip -l:

$ unzip -l target/dokpool-faq-importer.jar
Archive:  target/dokpool-faq-importer.jar
  Length      Date    Time    Name
---------  ---------- -----   ----

*snip*

   190432  2018-02-09 22:18   lib/gson-2.2.4.jar
   751238  2018-02-09 22:18   lib/commons-collections4-4.1.jar
    26514  2018-02-09 22:18   lib/stax-api-1.0.1.jar
  1433719  2018-02-09 22:18   lib/poi-ooxml-3.16.jar


Generally classloaders will only load classes from within a jar, not 
jars-within-jars.


If you want to only have a single jar with everything in, you'll either 
need to switch to a more "war-like" classloader to have jars within your 
jar loaded, or use something like shading/shadowing when you build the jar 
to have the classes of the dependencies in-lined within your jar


Nick


Re: upgrade from 3.16 to 3.17 fails with cannot access org.apache.poi.wp.usermodel.Paragraph

2018-02-11 Thread Nick Burch

On Sun, 11 Feb 2018, Marco Lechner - FOSSGIS e.V. wrote:

when I try to upgrade to 3.17 from 3.16, I get the following error:

cannot access org.apache.poi.wp.usermodel.Paragraph
[ERROR] class file for org.apache.poi.wp.usermodel.Paragraph not found


This class is contained within the main Apache POI jar. You need both the 
OOXML jar and the main jar, eg poi-3.17.jar and poi-ooxml-3.17.jar, along 
with their dependencies


Nick




Re: POI snapshot jars

2017-12-18 Thread Nick Burch

On Fri, 15 Dec 2017, pj.fanning wrote:
Do we publish poi snapshot jars to a maven repo? I checked 
http://repository.apache.org/snapshots/ but didn't find poi jars.


On the whole, the ASF isn't a huge fan of letting "general end users" work 
with nightly builds/snapshots. Generally, there's a desire that end users 
should be able to get recent-ish official releases with their fixes in 
from the project. See 
http://www.apache.org/legal/release-policy.html#publication


We do make available nightly builds to help developers / contributors with 
testing fixes and new contributions, details near the bottom of 
http://poi.apache.org/download.html


Nick




Re: getCellType and getCellTypeEnum

2017-11-14 Thread Nick Burch

On Tue, 14 Nov 2017, polatalemdar wrote:

http://poi.apache.org/apidocs/

Can you show me what is the description of the deprecation and what
alternative we can use in this documentation?


The website always shows the latest javadocs. As such, it shows the new 
Enum-returning version as not deprecated, as the change has happened



I am using 3.15


You need to check the javadocs for 3.15 - they come with the binary 
download, or can be fetched from Maven


Nick




Re: getCellType and getCellTypeEnum

2017-11-14 Thread Nick Burch

On Tue, 14 Nov 2017, polatalemdar wrote:

Can you explain what should we use if both of them are deprecated?


Did you read the full deprecation notes in the javadocs for the version 
of Apache POI you are using? That explains what is happening, and what to 
do in the meantime


Alternately, grab a nightly build of Apache POI, and you can use the 
post-breakage Enum-returning getCellType with no warnings!



BY the way: CONGRATULATIONS TO THE specific developer who made both of 
the functions deprecated. I believe no one is using this library lively. 
I will stop using it for ever once I fix that.


Blame the people who wrote the Java language specification - because of 
how Java treats changes to return types on methods, and the limited 
deprecation + change semantics, it's really the only way to warn people 
that the change is coming
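
To illustrate the constraint, here is a hypothetical sketch (the class LegacyCell and the enum Kind are invented, not POI classes). Java does not allow two methods that differ only in return type, so the int-returning method cannot gain an enum-returning twin under the same name. A transition needs a second, temporarily named method, and both can end up marked deprecated for a while.

```java
public class LegacyCell {
    public enum Kind { NUMERIC, STRING }

    // Old style: int codes. Deprecated because the return type is going to change.
    @Deprecated
    public int getCellType() {
        return Kind.NUMERIC.ordinal();
    }

    // Transitional name for the enum-returning version. Declaring
    // "public Kind getCellType()" next to the int version would not compile,
    // since overloads cannot differ by return type alone. It is deprecated too,
    // because it becomes redundant once getCellType() itself returns the enum.
    @Deprecated
    public Kind getCellTypeEnum() {
        return Kind.NUMERIC;
    }
}
```

Callers who move to the enum-returning method early only have to rename the call once the final method is in place.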


Nick




Re: Conditional Formatting issue

2017-10-19 Thread Nick Burch

On Wed, 18 Oct 2017, Blake Watson wrote:
I've attached a greatly reduced version of the spreadsheet. About 15 
cells with one conditional. I've tried to reduce it further but can't 
seem to do it without altering the error. Actually, that might be 
important. Most of my tweaks, if they don't fix the problem, result in:


NullPointerException
org.apache.poi.ss.formula.ConditionalFormattingEvaluator.getRef
(ConditionalFormattingEvaluator.java:210)


Could you open a bug in bugzilla, upload the file, and a snippet of code 
needed to reproduce the error? It's much less likely to get lost / 
forgotten on bugzilla than email!


Nick




Re: Conditional Formatting issue

2017-10-18 Thread Nick Burch

On Wed, 18 Oct 2017, Blake Watson wrote:
Related: I downloaded Eclipse and POI to build a test case, but I'm kind 
of at a loss. I haven't been able to run the tests from Eclipse.


The steps ought to be:
 * Ensure you're on a version of Eclipse that supports Java 8
 * Checkout from svn / git
 * On the command line, do "ant compile" to have dependencies fetched
 * In Eclipse, do Import -> General -> Existing Project into Workspace
 * Point it at your checkout
 * Wait for the build to finish
 * Right click on a unit test and do Run As -> JUnit Test

Nick




Re: Conditional Formatting issue

2017-10-11 Thread Nick Burch

On Tue, 10 Oct 2017, Blake Watson wrote:

I'm trying to create a simplest example of this but I have a situation
where I:

1. Load a workbook with a conditional.
2. Create a FormulaEvaluator for that workbook.
3. Create a ConditionalFormattingEvaluator for that workbook and evaluator.
4. Create a Cell for a cell in the workbook that has formatting.
5. Call getConditionalFormattingForCell for that CFE made in #3 and the
cell made in #4.
6. POI returns a "NullPointerException java.util.Calendar.setTime
(Calendar.java:1770)"


Can you turn this into a junit unit test? If so, please upload it to 
bugzilla and we'll step through it with a debugger to see where the POI 
bug is! We'll also then have a unit test to confirm it's fixed + stays 
fixed in the future :)


Nick




Re: Corrupted xlsx file on just one server

2017-09-07 Thread Nick Burch

On Thu, 7 Sep 2017, Seb Duggan wrote:

Notice the differences in the first two lines...

There is a minimal difference in the Java version on the servers: my local
computer is running 1.8.0_112, staging is running 1.8.0_45, and live
(producing the corrupt files) is on 1.8.0_66. All are Oracle 64-bit.

Can anyone think of what might be causing the difference in output between
the servers? I've been banging my head against this now for a couple of
days!


My best guess is that on production, you have deliberately or accidentally 
configured in a different (likely older / less functional) XML library as 
your JVM-default XML handler


I'd suggest doing something like "SAXParserFactory.newInstance();" then 
checking what class you got + which jar that class came from. My hunch is 
they won't be the same on both servers, and likely won't be what you 
wanted / expected on your broken server!
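
Something along these lines should do for the check, using only JDK calls. The class name XmlParserCheck is made up. A null code source is treated here as "came with the JDK", which is an assumption that holds for the usual built-in parser but is worth double-checking on your setup.

```java
import java.security.CodeSource;
import javax.xml.parsers.SAXParserFactory;

public class XmlParserCheck {
    // Describes the concrete class behind a JAXP factory and where it was loaded from.
    public static String describe(Object factory) {
        Class<?> impl = factory.getClass();
        CodeSource source = impl.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "the JDK itself (no code source reported)"
                : source.getLocation().toString();
        return impl.getName() + " from " + location;
    }

    public static void main(String[] args) {
        System.out.println(describe(SAXParserFactory.newInstance()));
    }
}
```

Run it on each server with the same classpath as the application and compare the output line by line.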


Nick




Re: Fwd: Clarification on Actual Date from Excel

2017-06-08 Thread Nick Burch

On Thu, 8 Jun 2017, sridhar vr wrote:

We are working on a requirement where the value in Excel is 05/06/2017,
but when retrieved it comes back as 05-Jun-2017.


That looks correct to me! It also would look correct to most of the world 
too... See the handy map on 
https://en.wikipedia.org/wiki/Date_format_by_country



The format is not fixed (05/06/2017 is just one example); it may come in any
format (dynamic), so I can't use a date formatter either.

But I want to get the actual date as 05/06/2017 from the system.


Simplest would be to check if the cell is a date cell, then fetch the cell 
value as a Java date, and finally format that in Java however you really 
wanted the date format to be


You'll want
 * DateUtil.isCellDateFormatted(cell)
 * cell.getDateCellValue()
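
The last step, formatting the Java date yourself, could look like the sketch below. It only uses the JDK. The class name and the dd/MM/yyyy pattern are assumptions about what you want to output, and the sample date stands in for whatever cell.getDateCellValue() hands back once DateUtil.isCellDateFormatted(cell) has said yes.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

public class DateOut {
    // Formats a java.util.Date as day/month/year in the JVM's default time zone.
    // The pattern is an assumption about the desired output, not something read
    // from the spreadsheet.
    public static String format(Date date) {
        return new SimpleDateFormat("dd/MM/yyyy", Locale.ROOT).format(date);
    }

    public static void main(String[] args) {
        // Stand-in for the Date that cell.getDateCellValue() would return
        Date sample = new GregorianCalendar(2017, Calendar.JUNE, 5).getTime();
        System.out.println(format(sample));
    }
}
```

Because the output pattern lives in your code, it no longer matters which display format the spreadsheet author picked for the cell.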

Nick




Re: IS it safe to write to multiple sheets from threads?

2017-05-14 Thread Nick Burch

On Sat, 13 May 2017, yevsh wrote:
Is it safe to write to multiple sheets that belong to same workbook, 
from multiple threads?


No

It is safe for multiple threads to write to their own workbooks. However, 
all operations on a single workbook must be done from the same thread


Nick




Re: Java API for Excel

2017-04-22 Thread Nick Burch
Apache Commons CSV talks the Excel CSV variant (amongst others - it's 
configurable)


Nick

On 21/04/17 22:48, Javen O'Neal wrote:

I'm guessing that an Excel CSV file is a regular CSV file that is written
in a specific dialect (quoted strings, newline character, delimiter, etc).

You could try transforming your CSV file into an Excel dialect if you knew
what this dialect was (I'm assuming the receiving website has a spec and a
file validator, either of which could be used to figure out why vanilla CSV
files are rejected).

You could try using Python's csv module to write your data in the Excel
dialect, which might be easier than a Java project.

You could also work with the website authors for a better way to exchange
data. If you need some teeth with your request, mention how their system
might have issues demonstrating compliance with HIPAA.


On Apr 21, 2017 11:04 AM, "Dominik Stadler" wrote:

Hi,

Can you share an anonymized sample-csv-file? Apache POI likely does not
have support for this itself, but if it is a text-format, it should be
fairly easy to iterate over the rows/cells and produce the text-format
yourself if we can figure out how Excel formats this CSV in some special
way to be able to later detect the format again.

BTW, please post such questions to the user-list, not to developers
directly, this prevents others from participating in the documentation.

Dominik.

On Fri, Apr 21, 2017 at 7:45 PM, Siva Sripada wrote:


Hi,
My name is Siva Sripada and I am a physician living in Michigan. I am
writing a program to upload a list of patients (first name, last name,
birthdate) to a website that will provide me information on a patient’s
previous prescription history over the last 1 year.
I am using the Apache API for Java for Excel and the program works
great!!! but... The website requires that the file be in *Excel’s* CSV
format only. So for example the website will not take a text file with
comma delimiters saved as a CSV file. Nor will it accept a .xlsx that was
simply renamed with a .csv extension. Currently I am opening the .xlsx file
saved with my program through Excel and re-saving it in Excel’s CSV format
- which works but does not leave me satisfied.

Excel’s CSV files have the unique(?) property of being able to be opened
by both Excel and any text editor (don’t know if that means anything to
you). Plus these files when opened in a text editor have comma delimiters
plus a carriage return at the end of each line.

My question is :
Is there a way to change the formatting of the .xlsx file through the Apache
API so that it saves it as a “true” Excel CSV file?

Any help would be appreciated.
Thank you,
Siva Sripada









Re: Evaluating Arbitrary Formula

2017-04-19 Thread Nick Burch

On Wed, 19 Apr 2017, Greg Woolsey wrote:
Missed the 2nd half of the question.  This class only returns rules that 
match the current state of the workbook for a given cell - rules that 
would be applied were it open in Excel.  Note that this logic is limited 
to evaluating functions actually implemented in POI, which is most of 
them, but there are some exceptions and a few open bugs.


Is it worth adding this as another demo method to 
src/examples/src/org/apache/poi/ss/examples/ConditionalFormats.java ?


Nick




Re: How to identify date cell type using XMLStreamReader (poi's XSSFReader api)

2017-03-29 Thread Nick Burch

On Wed, 29 Mar 2017, kakadi wrote:
Can you please give me an example on how to get formatIndex and 
formatString for my above example as isADateFormat(int formatIndex, 
java.lang.String formatString) expects both the parameters


Take a look at XSSFSheetXMLHandler - that shows how to get the format 
index and string, then use it when a numeric cell closes


Nick




Re: Apache POI - Detecting difference between an xlsx file and a normal zip file

2017-03-22 Thread Nick Burch

On Wed, 22 Mar 2017, Thiyagarajan wrote:

I have an InputStream wrapped in a BufferedInputStream and I'm trying to
detect whether it is a normal zip file or a xlsx file (and take appropriate
actions accordingly). I have tried to use hasOOXMLHeader

to achieve this. But it just checks if the input stream is a zip file and
there is nothing specific for an xlsx file there. (I understand that xlsx is
a zip file with a bunch of xml files).

Is it possible to detect if the inputstream is from xlsx file or a normal
zip file?


Not easily from an InputStream. You'd need to check if there's a 
[Content_Types].xml file in the zip to have a good idea, and that may not 
be the first file in the zip. So, you'll need to buffer the zip stream 
into memory, and parse through the entries to see if there's a content 
types, and rewind back to process if so. Much easier with a File, as you 
can easily do random-access to check without buffering
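
A rough sketch of the buffer-and-scan approach described above, using only java.util.zip rather than any POI API. The class and method names (OoxmlSniffer, buffer, hasContentTypes, zipOf) are made up for illustration, and reading the whole stream into a byte array is an assumption that only suits files that fit in memory:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class OoxmlSniffer {
    /** Copies the whole stream into memory so it can be scanned now and re-read later. */
    static byte[] buffer(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if any entry is named [Content_Types].xml, wherever it sits in the zip. */
    static boolean hasContentTypes(byte[] zipBytes) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("[Content_Types].xml".equals(entry.getName())) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper (hypothetical): builds a small in-memory zip with the given entry names. */
    static byte[] zipOf(String... names) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(out)) {
                for (String name : names) {
                    zip.putNextEntry(new ZipEntry(name));
                    zip.write("<x/>".getBytes(StandardCharsets.UTF_8));
                    zip.closeEntry();
                }
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Once the check passes, the same byte array can be wrapped in a fresh ByteArrayInputStream for the real processing, which is the "rewind" step. Note that a content types entry only says the zip is some OOXML package (Word and PowerPoint files have one too), so telling xlsx apart from those would need a further check.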


Nick




Re: xssf writing large numbers

2017-01-30 Thread Nick Burch

On Tue, 31 Jan 2017, Cem Dayanik (Ibtech-Software Infrastructure) wrote:
Although I have searched the net quite a lot, I haven't been able to find 
an exact solution.


I haven't been able to write a number with 16 digits.

Ex) 3340005973272861


That's an Excel limitation, or rather a limitation in the various Excel 
file formats. See 
https://support.microsoft.com/en-in/help/269370/last-digits-are-changed-to-zeroes-when-you-type-long-numbers-in-cells-of-excel



Is there a solution other than writing it as a string?


As per the above Microsoft support article, if you want to keep using the 
XLS or XLSX file formats, a string is your only option. Other file formats 
avoid this limitation, though if Excel reads those files back in it'll 
often do the same truncating on them when loaded...
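
To make the effect concrete, here is a small sketch that mimics what the support article describes: only the first 15 significant digits survive and the rest become zeroes. It uses plain BigDecimal, not POI, and the class and method names are hypothetical. Truncation (rather than rounding) is my reading of the article, so treat that as an assumption:

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

class FifteenDigits {
    /** Keeps the first 15 significant digits and zeroes out anything after them. */
    static String asExcelWouldKeep(String digits) {
        MathContext fifteen = new MathContext(15, RoundingMode.DOWN);
        return new BigDecimal(digits).round(fifteen).toPlainString();
    }

    public static void main(String[] args) {
        String original = "3340005973272861";
        System.out.println(original + " -> " + asExcelWouldKeep(original));
    }
}
```

If every digit matters (account numbers, IDs and the like), this is why the value has to go into the cell as a string.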


Nick




Re: How do you code cell striping?

2016-12-12 Thread Nick Burch

On Mon, 12 Dec 2016, Eric Douglas wrote:
I found one sample that shows how to code the condition using 
org.apache.poi.ss.usermodel.SheetConditionalFormatting.addConditionalFormatting() 
to put in the formula that would color each cell if it's in an even 
numbered row, but I'm having trouble figuring out the API to apply the 
formula to every cell on the worksheet.


For every cell on a sheet, just give a cellrangeaddress that covers the 
whole extent


For every formula cell, you'd need to loop over all cells checking the 
cell type, then add just those


Nick




Re: poifs: Orphan VBA directory entry when removed

2016-12-06 Thread Nick Burch

On Mon, 5 Dec 2016, Maxime GUERREIRO wrote:

I'm trying to remove a directory structure from a Composite Document
File V2 Document file,
and my code works in every case I could test except one.

Here it is: https://gist.github.com/mguerreiro/7023cab737917e936ac40857cdc12035
sampleCall is the call I'm using.

My issue is that when trying to nuke "Macros", I have a stale
directory entry named "VBA".
I can't find it using Apache POI's API: it doesn't seem to detect
orphans (why would it, after all?), so I can't remove them manually
either.


I've tried writing a unit test, based on your code, but I can't reproduce 
the issue. See RecursiveDelete() of TestNPOIFSFileSystem


The only thing I will say - NPOIFSFileSystem will only ever extend files, 
never shrink them. It doesn't zero-out removed bits either, just zaps the 
references.


We've talked about adding a "defrag" mode to it, but not for a while. The 
only way to fully get the space back from deleted entries / un-used blocks 
is to copy (EntryUtils helps) from the open NPOIFS into a brand new one, 
then save that. If you want to fully zap something nasty, you probably 
want to take that route.


Nick




Re: Dependencies of APACHE POI

2016-11-18 Thread Nick Burch

On Fri, 18 Nov 2016, Sateesh K Kolusu wrote:
Hi - How/Where do we exactly know the versions of dependency components 
for APACHE POI 3.14 ?


For human-readable, see http://poi.apache.org/overview.html#components

For machine-readable, see the maven poms for the components you use, eg 
https://mvnrepository.com/artifact/org.apache.poi/poi-ooxml/3.14


Nick




Re: Restarting styled numbered/lettered lists

2016-11-03 Thread Nick Burch

On Thu, 3 Nov 2016, Jim Klo wrote:
As I toiled away trying to figure out how to restart styled 
bulleted/numbered/lettered lists without messing with the applied 
numbering style - as there seems to be a lack of reasonable examples (in 
both the unit tests, and existing documentation and tutorials)


Where would you have expected to find a method to do this in XWPF, and 
what would you have expected it to look like?


I finally found a solution - and I kind of understand it… So I published 
a full working sample here: 
https://github.com/jimklo/apache-poi-sample/blob/master/src/main/java/com/sri/jklo/StyledDocument.java


Based on the investigations you've done shown in that code, we might be 
able to add a friendly method, especially if you can suggest what that 
method should look like... :)


Nick


Re: POI-3.15: Commons-Collections-4 vs. Jasper Reports' dependency on Commons-Collections-3

2016-10-26 Thread Nick Burch

On Wed, 26 Oct 2016, Andreas Reichel wrote:
Since POI 3.15 the software depends on commons-collections-4 (previously 
it depended on commons-collections-3 only). There is however another 
good software library "Jasper Reports", which still depends on 
commons-collections-3 and also Apache POI.


When using commons-collections-4 and POI-3.15, Jasper Reports throws:
Exception in thread "AWT-EventQueue-0" java.lang.NoClassDefFoundError: 
org/apache/commons/collections/map/ReferenceMap
at 
net.sf.jasperreports.engine.component.ComponentsEnvironment.(ComponentsEnvironment.java:57)


You'd probably be best off asking the Apache Commons Collections folks 
[1], and/or get Jasper to upgrade to a newer Commons Collections


Nick

[1] http://commons.apache.org/proper/commons-collections/mail-lists.html




Re: Linking External Workbooks

2016-10-06 Thread Nick Burch

On Thu, 6 Oct 2016, Blake Watson wrote:

That's just how Excel stores it for XLSX files. The link table provides
the mapping between those indexes and the names shown in Excel. When POI
hits one of those, it goes to the link table to find the name of the file,
then checks for a setup referenced workbook with that name to resolve
things with


So, when POI sees the [1] how does it know that that should map to
"myspreadsheet.xlsx"? Because the setupReferencedWorkbooks map is
"name"->evaluator. I feel like not knowing this is the key to my confusion.


The name to ID mapping is stored in the ExternalLinksTable, which comes 
from another xml file within the .xlsx bundle. That's what POI uses to 
lookup the name, and what Excel transparently uses when displaying the 
formula
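
As a conceptual sketch only (this is not POI's code, and the class and method names are invented), the lookup amounts to turning the "[n]" prefix into a position in the list of linked workbook names. The 1-based indexing is my assumption about how the stored index relates to the link table order:

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExternalRefLookup {
    private static final Pattern INDEXED_REF = Pattern.compile("^\\[(\\d+)\\]");

    /** Maps the "[n]" prefix of a stored formula to a workbook name, or null if it can't. */
    static String workbookNameFor(String storedFormula, List<String> linkTable) {
        Matcher m = INDEXED_REF.matcher(storedFormula);
        if (!m.find()) {
            return null; // no external workbook prefix at all
        }
        int index = Integer.parseInt(m.group(1));
        if (index < 1 || index > linkTable.size()) {
            return null; // the index points outside the link table
        }
        return linkTable.get(index - 1); // assumed 1-based in the file
    }
}
```

The name that comes back is the one that has to match a key in the referenced workbooks map, which is why putting "[1]" or "1" in as the key doesn't help.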


Nick


Re: Linking External Workbooks

2016-10-06 Thread Nick Burch

On Wed, 5 Oct 2016, Blake Watson wrote:

I've also tried putting in "[1]" or "1" in the map rather than my
spreadsheet name. I don't see in all this how the spreadsheet name in Excel
comes out as "[1]" in POI.


That's just how Excel stores it for XLSX files. The link table provides 
the mapping between those indexes and the names shown in Excel. When POI 
hits one of those, it goes to the link table to find the name of the file, 
then checks for a setup referenced workbook with that name to resolve 
things with


Take a look at TestXSSFFormulaEvaluation method 
testReferencesToOtherWorkbooks() + referenced test files for how it ought 
to work



7. Call evaluateFormulaCell on the cell checked in step 3 and get:
IllegalArgumentException Invalid sheetIndex: -1.
org.apache.poi.ss.formula.SheetRefEvaluator.
(SheetRefEvaluator.java:36)


That doesn't look like the normal error for missing linked workbooks, so 
there might be a bug


Can you create two very simple workbooks, both with 2 sheets, one with a 
handful of data cells in, the other with your formula, which shows the 
problem? If so, please upload them to bugzilla along with a small junit 
unit test, and we'll take a look


Nick




Re: BigGridDemo for a RichText

2016-10-03 Thread Nick Burch

On Sun, 2 Oct 2016, Ratna wrote:

Could you please specify the XML format for each cell, as mine is not working


Don't use BigGridDemo any more! You should use SXSSF instead

Nick




RE: Formulas don't throw exceptions but show up "#NAME"

2016-09-13 Thread Nick Burch

On Tue, 13 Sep 2016, Justin Flowers wrote:
OK, it was a sanity issue on my part. Apparently you can get this issue 
if a UDF is not defined. And of course I wrote a Java UDF here that it 
cannot find on its side. The functions I'm implementing already exist in 
Excel, just not yet in POI ("STDEV.P" and "T.TEST").


Ah. You can't generally use UDFs to implement built-in functions that POI 
lacks, only brand new custom functions. See 
http://poi.apache.org/spreadsheet/eval-devguide.html for a bit more on 
implementing a "missing" function, and also a talk from an old ApacheCon:

http://home.apache.org/~yegor/apachecon_us2010/Evaluation_Of_Excel_Formulas_In_POI.pptx

We'd love a patch if you can implement the missing functions! See 
http://poi.apache.org/guidelines.html


Nick




Re: Including poi-ooxml-schemas

2016-08-12 Thread Nick Burch

On Thu, 11 Aug 2016, Branden Visser wrote:

So it begs the question, what is the intended usage of this
"poi-ooxml-schemas" dependency?


http://poi.apache.org/faq.html#faq-N10025

Nick




Re: Reading xlsx file using poi3.7

2016-06-14 Thread Nick Burch

On Tue, 14 Jun 2016, pavanikureti wrote:
I will change the poi jars with the suggested version. I am using few 
more jars also as below. javax.xml.stream-1.0.1, xmlbeans-2.3.0.jar, 
dom4j.jar, ooxml-schemas-1.0.jar Are these fine?


See http://poi.apache.org/overview.html#components for the details of what 
dependent jars are required by the various POI components. (Some of your 
jars aren't needed, some need upgrading, the components page will tell you 
what)


Or switch to something like Gradle or Maven, which takes care of all of 
this sort of stuff for you!


Nick




Re: Reading xlsx file using poi3.7

2016-06-14 Thread Nick Burch

On Tue, 14 Jun 2016, pavanikureti wrote:

I am using following jars
javax.xml.stream-1.0.1,
poi-3.8-final-jdk1.4-20120520-rc1,
poi-3.8-final-jdk1.4-ooxml-20120520-rc1,
xmlbeans-2.3.0.jar,
dom4j.jar,
ooxml-schemas-1.0.jar

Can you please let me know am I using the correct jars?


Nope, you should really be using at least 3.14, if not 3.15-beta1

You can get the new jars through Maven, through Gradle, or from our 
download site: http://poi.apache.org/download.html


Nick




Re: CTLongHexNumber

2016-06-02 Thread Nick Burch

On Thu, 2 Jun 2016, Murphy, Mark wrote:
Is there a way to use CTLongHexNumber without it causing a type cannot 
be resolved error?


The error is: The type 
org.openxmlformats.schemas.wordprocessingml.x2006.main.CTLongHexNumber 
cannot be resolved. It is indirectly referenced from required .class 
files.


Are you using the full ooxml-schemas jar, or the small poi-ooxml-schemas 
one? (If the latter, see the FAQ for the fix!)


Nick




Re: How to write normal and subscript text in one excel cell?

2016-06-02 Thread Nick Burch

On Thu, 2 Jun 2016, janek.schroeder wrote:

I want to insert normal text and subscript/superscript (eg. CO2) in one
excel cell.


You want to use RichTextString functionality. See the docs for more:
http://poi.apache.org/spreadsheet/quick-guide.html#RichText
And also some examples, eg
https://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/xssf/usermodel/examples/WorkingWithRichText.java

Nick




Re: extracting hyperlinks from xlsx with XSSFEventBasedExcelExtractor?

2016-05-12 Thread Nick Burch

On Thu, 12 May 2016, Allison, Timothy B. wrote:
On TIKA-1454, one of our users asked to add extraction of hyperlinks 
from xlsx.  It looks like hyperlink info appears at the end of the 
sheet.xml (after the </sheetData>, in the <hyperlinks> section).  Are 
there any recommendations for merging hyperlink info with the actual 
cells via XSSFEventBasedExcelExtractor?  Double-pass on the sheet.xml or 
just give up and dump the hyperlinks at the end of the sheet... or other 
options?


The only option I can think of that'd work would be an optional flag which 
would trigger a second SAX processing with a different handler. That would 
capture the (hopefully!) small set of hyperlink data, which the "normal" 
second process could then use


Nick




Re: not implemented subtotal functions (101-111)

2016-04-26 Thread Nick Burch

On Tue, 26 Apr 2016, Jean Rossier wrote:
I'm wondering if there is any roadmap to implement the subtotal functions 
with the option "ignore hidden values" (function code 101-111) ?

https://poi.apache.org/apidocs/org/apache/poi/ss/formula/functions/Subtotal.html


We'd love a contribution for this! Please join the dev list if you'd like 
some pointers on adding in this support


Nick




Re: POI not parsing these XLS file

2016-04-21 Thread Nick Burch

On Thu, 21 Apr 2016, Andrew Munn wrote:

I can not get POI to parse these XLS files being generated by Bloomberg
into something useful.  I can use Tika to parse them into one long
line of HTML but then there are no cell breaks or line breaks.

http://www.topazdevelopment.com/tmp/poi/2010-cal-eu.xls


This is not an Excel .xls or .xlsx file. Instead, it's one of those 
strange Microsoft "Excel" xml files, which just uses a single xml file to 
hold a stripped-down version of the spreadsheet


Apache POI only supports .xls and .xlsx files, you'll need to write your 
own XML parsing code to handle these XML files


Nick




RE: New POI problem

2016-04-15 Thread Nick Burch

On Fri, 15 Apr 2016, Thaddaeus Fillmore - US wrote:
Thanks for the reply!  I actually got it to work using ExtractorFactory 
though.  (I had a typo in the path to the jar files).  Is Tika just for 
Office documents or can it also read other formats?  Ideally I'd like 
something that could process plain text, Word documents, pdfs, and 
images, but as of right now I'm able to handle all of those formats 
using a variety of means.


Apache Tika can probably get text out of your kitchen sink! Especially if 
it's Panamanian... ;-)


Nick

Current formats = http://tika.apache.org/1.12/formats.html
Tika's use on the Panama Papers = 
https://source.opennews.org/en-US/articles/people-and-tech-behind-panama-papers/




Re: New POI problem

2016-04-15 Thread Nick Burch

On Fri, 15 Apr 2016, Thaddaeus Fillmore - US wrote:
Gah, I'm back.  Ok, so now I'm trying to extract the text from a word 
document being uploaded to the server.  (This is all in ColdFusion).  I 
first write a temp copy of the file to the disk.  I have verified the 
file writes successfully and can be opened in word.  Then, I try to use 
POI to read the file, but now I'm getting an exception when I try to use 
ExtractorFactory to createExtractor for the file.


I wouldn't recommend ExtractorFactory for new installations / new uses. 
You'd be much better off using Apache Tika instead for text extraction. 
Apache Tika builds on top of POI, amongst many others, and is where the 
bulk of the text extraction work happens these days. Tika can give you 
plain text, or metadata, or html, and generally does a lot more than the 
(now rather old) POI simple text extractors offer


You can also use the Tika Server or Tika CLI, which may be simpler for 
integration than trying to get the right jars and right class invocations 
in a framework like ColdFusion


Nick




Re: Need Help with XWPFDocument issue

2016-04-13 Thread Nick Burch

On Wed, 13 Apr 2016, Thaddaeus Fillmore - US wrote:
Hello, I'm venturing into using POI to extract text from Word documents. 
For some reason when XWPFDocument initializes from my FileInputStream, 
there's an exception generated from java.util.zip.InflaterInputStream.


http://poi.apache.org/spreadsheet/quick-guide.html#FileInputStream
That's for Excel files, but much the same thing applies for XWPF and XSLF 
too


So, try opening from a File directly instead, don't use a stream if you 
don't have to!


Nick




Re: Digging into Conditionals....

2016-03-30 Thread Nick Burch

On Tue, 29 Mar 2016, Blake Watson wrote:

I did a bit of stuff on it, enough to solve a $DAYJOB need, then stopped
again. Creation and Reading should both cover most things you want.
Modifying ought to have most things, but might need more work, especially
as it probably doesn't check enough to ensure you haven't put it in a bad
state...


Fortunately, we're not doing any modifying. Just trying to get to the
point where we can ask "What color should this cell be?"


You might be best off trying harder than I did to get someone at Vaadin 
to contribute some/most/all of their conditional format evaluation stuff 
back to POI then



Otherwise, look at how the formatter evaluation stuff we already have for 
data format strings like [red]#;[green]# works, and plan on expanding that 
with help from things like FormulaParser and FormulaEvaluator


Nick


Re: Digging into Conditionals....

2016-03-24 Thread Nick Burch

On Thu, 24 Mar 2016, Blake Watson wrote:

We're trying to figure out how to apply conditionals to our spreadsheets,
and have noticed that it looks like some work is being done in this area


I did a bit of stuff on it, enough to solve a $DAYJOB need, then stopped 
again. Creation and Reading should both cover most things you want. 
Modifying ought to have most things, but might need more work, especially 
as it probably doesn't check enough to ensure you haven't put it in a bad 
state...


Evaluation is a big missing piece, at least in POI. The folks at Vaadin 
have done some work on it, but I can't get any of them to reply to my 
emails about merging it back in :(


currently (as in new to 3.14). There's a new CTColorScaleImpl class, and 
it contains these baffling lines:


import 
org.openxmlformats.schemas.spreadsheetml.x2006.main.impl.CTColorScaleImpl.1CfvoList;
import 
org.openxmlformats.schemas.spreadsheetml.x2006.main.impl.CTColorScaleImpl.1ColorList;


Those will have been there for a long time in the full schemas jar, 
they're auto-compiled from the official file format spec XSDs. However, 
the work I did on XSSF CFs, and especially the unit tests, have pulled 
them into the smaller schemas jar. You probably want to use the full 
ooxml-schemas jar when doing new stuff, until we get the unit tests to 
drag them over


Otherwise, build a spreadsheet like you want in excel, save, unzip the 
.xlsx and read the xml to see what's needed!


Nick




Re: Reading .xlsm file with more than 1 millions of rows

2016-03-24 Thread Nick Burch

On Wed, 23 Mar 2016, arashmoeenj wrote:
Thank you for your reply. I'm pretty new to Apache POI and the concept 
of excel parsing, so is there any example for 'mid-level SAX-based 
SheetContentsHandler' that I can refer to ?


Umm, the one from my previous email?

Nick




Re: Reading .xlsm file with more than 1 millions of rows

2016-03-23 Thread Nick Burch

On Wed, 23 Mar 2016, arashmoeenj wrote:
In my project I have to parse an xlsm file which contains more than 1 
million rows (separated in multiple sheets each containing 1 million 
rows at maximum), and do some validations on the cells.


I've tried so many examples provided by Apache POI and Stack Overflow but
still couldn't manage to find a proper way to parse from the first
row/cell to the last one.


For a file that big, you likely want to be doing either low-level SAX 
parsing, or more likely use the mid-level SAX-based SheetContentsHandler 
interface



P.S: I need the blank cells to be also included in the parsing section.


https://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/xssf/eventusermodel/XLSX2CSV.java
shows how to handle missing cells and missing rows when using the 
mid-level SheetContentsHandler with the SAX stuff


Nick




Re: Reading (not writing) conditional states: possible?

2016-02-24 Thread Nick Burch

On Wed, 24 Feb 2016, Blake Watson wrote:

Thanks for the response, Nick!

I guess we'll have to roll our own...


And contribute it back? ;-)

Nick




Re: Reading (not writing) conditional states: possible?

2016-02-24 Thread Nick Burch

On Wed, 24 Feb 2016, Blake Watson wrote:
So, there's lots of examples on how to write out conditional stuff to 
Excel files.


If you look at the unit tests, you'll see quite a few examples of reading 
conditional formatting too



But I've got an Excel workbook with some conditional cells in it already,
and I want to know how to format those cells.


Ah. That's not reading, that's evaluating! The state isn't written in the 
file, unlike with formulas, the application just has to calculate it on 
the fly



Do I have to go through all the conditions that might apply and do it that
way?


Yup! Sadly. You'll need to take account of priorities too, in case several 
apply.
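
A very rough sketch of the priority handling, to show the shape of the problem. Everything here is hypothetical (the Rule class is a stand-in, not a POI type), it assumes that a lower priority number means higher precedence, and it simplifies "several apply" down to "the first matching rule wins":

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.DoublePredicate;

class CfPrioritySketch {
    /** Hypothetical stand-in for a conditional formatting rule; not a POI class. */
    static final class Rule {
        final int priority;
        final DoublePredicate applies;
        final String colour;

        Rule(int priority, DoublePredicate applies, String colour) {
            this.priority = priority;
            this.applies = applies;
            this.colour = colour;
        }
    }

    /** Colour of the matching rule with the lowest priority number, or null if none match. */
    static String colourFor(List<Rule> rules, double value) {
        List<Rule> ordered = new ArrayList<>(rules);
        ordered.sort(Comparator.comparingInt(r -> r.priority));
        for (Rule rule : ordered) {
            if (rule.applies.test(value)) {
                return rule.colour;
            }
        }
        return null;
    }
}
```

Real rules have conditions based on formulas, ranges, colour scales and so on, so the predicate here is standing in for the part that actually needs the formula evaluator.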


I think that Vaadin might have some code to do some of this. I've been 
unable to get in touch with anyone from the project to ask about 
contributing it back though[1] :(


Ideally we would have code in POI that would evaluate it. We already have 
code that can handle colours in data format strings (eg 
"[red]#,###.##;[green]-#,###.##") and tell you what colour the cell would 
be from that. That code would likely need to be generalised then 
conditional formatting support added. There's a few TODOs in the code to 
give an idea of the starting point.
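
For anyone unfamiliar with colours in data format strings, a minimal sketch of the idea follows. It is not how POI's formatter is written, the names are invented, and it assumes the usual section order (positive; negative; zero) while ignoring bracketed conditions and locale markers:

```java
import java.util.Locale;

class FormatColourSketch {
    /** Lower-cased colour name for the section matching the value's sign, or null if none. */
    static String colourFor(String formatString, double value) {
        String[] sections = formatString.split(";");
        String section = sections[0]; // positives, and everything else by default
        if (value < 0 && sections.length > 1) {
            section = sections[1];    // negatives get the second section
        } else if (value == 0 && sections.length > 2) {
            section = sections[2];    // zero gets the third section, if present
        }
        if (!section.startsWith("[")) {
            return null;              // this section carries no colour
        }
        int close = section.indexOf(']');
        if (close < 0) {
            return null;              // unterminated bracket, give up
        }
        return section.substring(1, close).toLowerCase(Locale.ROOT);
    }
}
```

Conditional formatting differs in that the condition is a rule evaluated against the workbook rather than just the sign of the value, which is where the generalising would be needed.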


Nick

[1] I've tried emailing the person listed in the Javadocs as being the 
author of the vaadin conditional formatting stuff, but never got any sort 
of reply





Re: When using SAXParser to parse xlsx file, blank cells are skipped

2016-02-22 Thread Nick Burch

On Mon, 22 Feb 2016, Lynn Li wrote:

When using SAXParser to parse xlsx file, blank cells are skipped.

Does anyone have code to not skip blank cells using SAXParser? Any help 
you can provide is appreciated.


You just need to keep track of the cell reference of the last cell seen, 
and spot gaps from that. It's fairly easy to do, a good example is in the 
XLSX to CSV example that uses the SAX parsing stuff to do it:

https://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/xssf/eventusermodel/XLSX2CSV.java
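
The gap-spotting idea can be sketched without any POI classes. This is not the XLSX2CSV code itself, just an illustration with invented names: it takes the cell references a SAX handler reported for one row, in order, and pads out the columns that were skipped:

```java
import java.util.ArrayList;
import java.util.List;

class BlankCellFiller {
    /** Converts the column letters of a reference like "C5" to a 0-based index (A=0, AA=26). */
    static int columnIndex(String cellRef) {
        int col = 0;
        for (int i = 0; i < cellRef.length(); i++) {
            char c = cellRef.charAt(i);
            if (!Character.isLetter(c)) {
                break; // reached the row number
            }
            col = col * 26 + (Character.toUpperCase(c) - 'A' + 1);
        }
        return col - 1;
    }

    /** Rebuilds one row, adding "" for every column skipped between the reported cells. */
    static List<String> fillGaps(String[] refs, String[] values) {
        List<String> row = new ArrayList<>();
        int lastCol = -1;
        for (int i = 0; i < refs.length; i++) {
            int col = columnIndex(refs[i]);
            for (int missing = lastCol + 1; missing < col; missing++) {
                row.add(""); // a blank cell the file never mentioned
            }
            row.add(values[i]);
            lastCol = col;
        }
        return row;
    }
}
```

In a real handler you would keep lastCol as a field and reset it at the start of each row; the same trick with row numbers catches rows that are missing entirely.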

Nick




Re: Ability to set solid color on DataBarFormatting

2016-02-04 Thread Nick Burch

On Wed, 3 Feb 2016, Maheshwar Jayaraman wrote:
Interesting. The spec from 2009 
(https://msdn.microsoft.com/en-us/library/hh656506(v=office.12).aspx) 
does have the attribute but I see POI is using 2006. Is there a reason 
it's not updated?


Nick, Any pointers?


I've got a feeling that someone tried, and loads of stuff broke... Your 
best bet would be to check the mailing list archives, and see if you can 
find the threads discussing it, probably from something like 3-4 years ago


Nick




Re: Ability to set solid color on DataBarFormatting

2016-01-27 Thread Nick Burch

On Wed, 27 Jan 2016, Maheshwar Jayaraman wrote:
I have one question. Which branch should I submit the patch on? Trunk or 
REL_3_11_branch?


Trunk - we're working up to 3.14 final at the moment!

Nick




Re: Ability to set solid color on DataBarFormatting

2016-01-27 Thread Nick Burch

On Tue, 26 Jan 2016, Maheshwar Jayaraman wrote:
The default is gradient fill for a data bar. I wanted it to be a solid 
(I think it translates to a gradient=0 attribute). I spent enough time 
trying to set this via reflection that I think I can take a stab at 
adding a patch for this. I will read docs to see how to contribute and 
take it forward from there.


First thing would be to produce new versions of 
NewStyleConditionalFormattings.xls / NewStyleConditionalFormattings.xlsx

which add in examples of the formatting you want into additional columns

Next, work out how to read that difference (unzip the XLSX file & read the 
xml / use BiffViewer)


Next, work out a sensible way to expose that difference in the conditional 
formatting usermodel classes


Now, add that to the usermodel, and matching lower level code as needed

Then update the unit tests to check on read and write

Finally, update the example to also generate these

And then you're done :)

Nick




Re: Change chart type programmatically?

2016-01-12 Thread Nick Burch

On Sat, 9 Jan 2016, Ken Hausam wrote:
I am using XSSFChart and associated classes to create a line chart using 
Apache POI. Works great! Thanks. My question is, is there an easy way to 
change the chart type programmatically from a line chart to a stacked 
bar chart? I looked quickly at the CTChart class and associated CT 
classes and noticed that the various chart types had their own class. 
This makes me think that it's not as easy as just flipping a chart type 
attribute somewhere, but figured it couldn't hurt to ask.


I haven't looked at the chart stuff recently, so I can't answer off the 
top of my head. What I'd suggest you do is firstly create a simple file in 
Excel, with one sheet, with a few data points, and one style of chart. 
Save that. Next, change the type, and save-as that. Next, unzip both .xlsx 
files (rename to .zip and unpack). Now, compare the xml, especially for 
sheets and charts, and see what differs. Post a summary of that, and we'll 
help if we can!
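
If unzipping by hand gets tedious, the comparison step can be roughed out in plain Java. This sketch (invented names, java.util.zip only, no POI) lists which parts of the two packages differ; it assumes both files fit in memory:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class XlsxPartDiff {
    /** Reads every entry of a zip (an .xlsx is one) into a sorted name -> bytes map. */
    static Map<String, byte[]> readParts(byte[] zipBytes) {
        Map<String, byte[]> parts = new TreeMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            byte[] chunk = new byte[8192];
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream body = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    body.write(chunk, 0, n);
                }
                parts.put(entry.getName(), body.toByteArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return parts;
    }

    /** Names of parts that exist on only one side, or whose bytes differ. */
    static Set<String> changedParts(byte[] before, byte[] after) {
        Map<String, byte[]> a = readParts(before);
        Map<String, byte[]> b = readParts(after);
        Set<String> names = new TreeSet<>(a.keySet());
        names.addAll(b.keySet());
        Set<String> changed = new TreeSet<>();
        for (String name : names) {
            // A part missing on one side comes back as null, which never equals an array
            if (!Arrays.equals(a.get(name), b.get(name))) {
                changed.add(name);
            }
        }
        return changed;
    }

    /** Demo helper (hypothetical): builds an in-memory zip from name, content, name, content... */
    static byte[] zipOf(String... nameThenContent) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(out)) {
                for (int i = 0; i + 1 < nameThenContent.length; i += 2) {
                    zip.putNextEntry(new ZipEntry(nameThenContent[i]));
                    zip.write(nameThenContent[i + 1].getBytes(StandardCharsets.UTF_8));
                    zip.closeEntry();
                }
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With real files you would feed it the bytes of the two saved workbooks. Expect a little noise from parts that change on every save, so the chart and sheet parts are the ones worth reading closely.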


Nick




Re: 3.12 org.apache.poi.ss.formula.FormulaParseException

2015-12-03 Thread Nick Burch

On Thu, 3 Dec 2015, Brian Milnes wrote:

As I don't know how to correctly elide the format, I'll send you the long
text.

It seems to be a defined name, perhaps some type of print area?


Looks to be a named range

What do you get if, on the first sheet, you just type into a cell
   =lsTipoEntidad

(lsTipoEntidad is the name of the range)


   [1]!tblTipoEntidad[Llave]


Thanks
Nick




Re: 3.12 org.apache.poi.ss.formula.FormulaParseException

2015-12-03 Thread Nick Burch

On Thu, 3 Dec 2015, Brian Milnes wrote:

I have a strange bug that I can't quite grasp.

org.apache.poi.ss.formula.FormulaParseException: Unused input [[llave]]
after attempting to parse the formula [[1]!tbltipoentidad[llave]]


Do you know what the formula is supposed to mean, and what Excel evaluates 
it as?



I could not find the string involved with searching or dumping
strings/formulas using POI until I rewrote the sheet as a FOPS
and it is here in the XML.


Probably the best way to find it without changing things is to save as 
.xlsx (if not already), rename to .zip, unzip, and check the sheet xml 
files
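
If you'd rather not unzip by hand, something along these lines would do the search for you. It's a sketch with invented names that uses only java.util.zip, not POI, and it matches case-insensitively on the assumption that the text in the error message may have been lower-cased:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class XlsxGrep {
    /** Names of the zip entries whose text contains the needle, ignoring case. */
    static List<String> partsContaining(byte[] zipBytes, String needle) {
        String wanted = needle.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            byte[] chunk = new byte[8192];
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream body = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    body.write(chunk, 0, n);
                }
                String text = new String(body.toByteArray(), StandardCharsets.UTF_8);
                if (text.toLowerCase(Locale.ROOT).contains(wanted)) {
                    hits.add(entry.getName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    /** Demo helper (hypothetical): builds an in-memory zip from name, content, name, content... */
    static byte[] zipOf(String... nameThenContent) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(out)) {
                for (int i = 0; i + 1 < nameThenContent.length; i += 2) {
                    zip.putNextEntry(new ZipEntry(nameThenContent[i]));
                    zip.write(nameThenContent[i + 1].getBytes(StandardCharsets.UTF_8));
                    zip.closeEntry();
                }
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

It's worth searching every part rather than just the sheets, since the text may live somewhere other than a cell formula.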


Nick

-
To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
For additional commands, e-mail: user-h...@poi.apache.org



Re: 3.12 org.apache.poi.ss.formula.FormulaParseException

2015-12-03 Thread Nick Burch

On Thu, 3 Dec 2015, Brian Milnes wrote:

It evaluates to #NAME as it's a broken formula of some type.


If you take the same string, and type it into a new formula, can Excel 
cope? Does it accept it, or error?


Nick

-
To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
For additional commands, e-mail: user-h...@poi.apache.org



Re: Advice on tracking down an error thrown by evaluateAllFormulaCells()

2015-11-25 Thread Nick Burch

On Wed, 25 Nov 2015, Javen O'Neal wrote:

Nick Burch said:

See 
http://people.apache.org/~yegor/apachecon_us2010/Evaluation_Of_Excel_Formulas_In_POI.pptx
for information on how evaluation works, functions, ptgs etc, if it's new to you


This presentation would have been helpful when I was first learning
how POI parses formulas. I couldn't find a link to this presentation
on the POI website, and it has some content that isn't in [1], [2], or
[3], which were the resources I used to learn how POI parses formulas.
How should we make this information available on the POI website?


Ideally we'd want someone to take the information, and drop it into the 
relevant parts of the website. Maybe also tidy up a few bits of those 
pages at the same time, based on what's coming over.


Short term I guess we could add a link to it from the eval dev guide?


One for anyone, not only committers - any volunteers to go through the 
presentation, and write up on a per-slide basis something like: p2 already 
on eval page, p3 should go to eval devguide page, p4 out of date, etc ?


Thanks
Nick




Re: Advice on tracking down an error thrown by evaluateAllFormulaCells()

2015-11-24 Thread Nick Burch

On Tue, 24 Nov 2015, Tom Chiverton wrote:
So, TREND() isn't implemented. Why don't I get a NotImplementedException 
then ?


I'll see if I can knock up a quick implementation to contribute.


POI only has some support for array functions, so I wonder if it's 
tripping up on that first?


A trend function would be wonderful if you could do one though! See
http://people.apache.org/~yegor/apachecon_us2010/Evaluation_Of_Excel_Formulas_In_POI.pptx
for information on how evaluation works, functions, ptgs etc, if it's new 
to you


Nick




Re: Advice on tracking down an error thrown by evaluateAllFormulaCells()

2015-11-24 Thread Nick Burch

On Tue, 24 Nov 2015, Tom Chiverton wrote:

I am trying to evaluate all the formulas in an Excel file. There are about
a dozen sheets, with several tens of formulas on each, all driven by a few
input fields on the first sheet.
This all works fine in Excel 365 itself.

However, when I try and run it via the latest POI,
evaluateAllFormulaCells() is throwing "Unexpected ptg class
(org.apache.poi.ss.formula.ptg.ArrayPtg)".

What's the best way to track this down ?


Do the same logic as that method does - loop over the sheets, then the 
rows, then the cells, and evaluate each cell one at a time. Use that to 
identify which cell, and hence which formula is the problem


Nick




Re: Reading Excel 5.0/7.0 files

2015-11-23 Thread Nick Burch

On Mon, 23 Nov 2015, bestfrieend wrote:

can anyone please tell me how to read older excel files?


Apache POI doesn't have any high-level APIs for working with the older 
Excel file formats.


If you just want the textual contents, then there's 
org.apache.poi.hssf.extractor.OldExcelExtractor which will pull out the 
text and numbers from the file


If you need the values in specific cells, then you'll need to take an 
approach a bit like OldExcelExtractor, process the file at the record 
level, and check for the co-ordinates on OldStringRecord, NumberRecord, 
OldFormulaRecord and friends


Nick




Re: Will poi library support reading encrypted excel file?

2015-11-18 Thread Nick Burch

On Wed, 18 Nov 2015, Ni Lei wrote:
I tried to read Excel files which are protected with password, but got 
the EncryptedDocumentException.


That's a sign that you need to have the password and to do the decryption!


Case 1: when reading .xlsx file:

Caused by: org.apache.poi.EncryptedDocumentException: The supplied spreadsheet 
seems to be an Encrypted .xlsx file. It must be decrypted before use by XSSF, 
it cannot be used by HSSF
   at 
org.apache.poi.hssf.usermodel.HSSFWorkbook.getWorkbookDirEntryName(HSSFWorkbook.java:252)
 ~[poi-3.11.jar:3.11]


This is entirely expected - you can't ask HSSFWorkbook (which reads XLS 
files) to process a password-protected XLSX file for you! You need to 
decrypt it first, then give it to XSSFWorkbook


see http://poi.apache.org/encryption.html


You can also supply the password when opening a workbook with 
WorkbookFactory



The poi version is 3.11, and the password is set in MS Excel 2013, 
Windows 7.


Upgrade!

Nick




Re: Data format

2015-11-13 Thread Nick Burch

On Thu, 12 Nov 2015, Javen O'Neal wrote:

In one POI project I'm working on, I am writing a date to a cell.
I'm using
CellStyle dateStyle = workbook.createCellStyle();
dateStyle.setDataFormat(22);

Is there a POI constant that I can use instead of the number 22?


org.apache.poi.ss.usermodel.BuiltinFormats ?

Though it might want extending to expose some of the more common ones in a 
more helpful constant-like way... Several of those get "localised" too, 
showing up as different things depending on the locale of the Excel 
program opening it. eg Format 0xe is defined as m/d/yy, but opening it up 
in a UK-English Excel shows as dd/mm/


Maybe we could combine providing more helpful constants with also 
capturing the locale-specific localised versions too? (IIRC it's only 
these built-in formats that get magically localised like that)


Nick




Re: InvalidFormatException with OPCPackage or XSSFWorkbook

2015-11-05 Thread Nick Burch

On Thu, 5 Nov 2015, Jonathan Hodges wrote:

Yes this file is from online and likely generated in a non-Excel system.
When I re-saved the file it works.

All this makes sense why it isn't working, but is there any other work
around besides re-saving via Excel, preferably a programmatic option?


Open a bug in bugzilla, attach a small file that shows the problem, and a 
short unit test that triggers it. We can then take a look to see if we can 
safely relax the compliance check that this file is failing on


Nick




Re: InvalidFormatException with OPCPackage or XSSFWorkbook

2015-11-05 Thread Nick Burch

On Thu, 5 Nov 2015, Jonathan Hodges wrote:
When I attempt to open an Excel file (.xlsx) file I receive the 
following stack trace.


org.apache.poi.openxml4j.exceptions.InvalidFormatException: The part
/_rels/.rels does not have any content type ! Rule: Package require content
types when retrieving a part from a package. [M.1.14]
 at
org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:247)
 at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:684)
 at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:275)


Looks like a broken / invalid file to me. Do you know where it comes from? 
Perhaps generated by a non-Excel system?


If you open the file in Excel, do a save-as, does the resaved file work 
properly?


Nick




Re: Formula not working as expected

2015-11-05 Thread Nick Burch

On Thu, 5 Nov 2015, Jason Tomforde wrote:

A1 string cell = 1
A2 numeric cell = 2
A3 numeric cell = 3
A4 formula cell = IF(A1=1,A2,A3)

The result I am getting is 0.

Any thoughts?


Are you evaluating the formula when you're done with setting cells? 
http://poi.apache.org/spreadsheet/eval.html


Are you using a new enough version of Apache POI? 
http://poi.apache.org/download.html


Nick




RE: Drawing Borders is SLOW

2015-11-05 Thread Nick Burch

On Thu, 5 Nov 2015, Murphy, Mark wrote:
My thought was to build the borders on their own, then once the borders 
are created, apply them to the cell styles in a single step. This would 
require a few new objects.


You are going to have to change org.apache.poi.ss.util.CellUtil.setCellStyleProperty


Oh, you're using CellUtil! That does change things

I don't see why we couldn't add a method onto CellUtil which took multiple 
properties, and found/created a style with all of those on in one go


That's different to working on the CellStyle objects directly, which is 
what I thought you were doing



Please feel free to create an enhancement in bugzilla for this (stressing 
it's for CellUtil not CellStyle, to avoid others getting confused), then 
even better work up a patch if you can!


Nick




Re: moving formulas, or, what version of Poi should I use?

2015-11-05 Thread Nick Burch

On Thu, 5 Nov 2015, Ralph Johnson wrote:

When I started looking for info on changing formulas in Poi, I realized
that 3.6 is woefully out of date and that 3.13 has much better support for
updating cell references when you insert rows.  So, I installed 3.13 and
made sure the system worked with it.   But 3.13 doesn't match the javadocs
I find for Poi on the web.


JavaDocs on the website refer to trunk. For the javadocs that apply to 
3.13, those are included in the 3.13 binary download


In particular, the javadoc says that XSSFSheet has a copyRows method 
that takes a CellCopyPolicy as an argument, but it doesn't have that 
method in 3.13, and I can't find that class.


Added recently, see the changelog for details


There should hopefully be a 3.14 beta 1 release quite soon, which will be 
the first release with that functionality in. For now, you'd need to 
checkout from svn/git and build yourself


Nick




Re: Excel Files using XSSFWorkbook over a certain size display an excel error when I open the .xls file in excel

2015-11-04 Thread Nick Burch

On Wed, 4 Nov 2015, Schene, Chris wrote:

I am using the code below to create a very large spread sheet that is 47
rows wide.

There are a few very large strings in the rows, but for the most part the
data is fairly small.

If the .xls file is over about  1000 rows I get an error when I load the
.xls file in excel saying it needs to repair the file and I lose all the
row color and column width settings.


Cell Styles are workbook scoped, not cell scoped. Make sure you create all 
of your styles outside of the loop, and re-use them. If you create one 
style per cell, you'll use too many and Excel will sulk


Nick




Re: Drawing Borders is SLOW

2015-11-03 Thread Nick Burch

On Tue, 3 Nov 2015, Murphy, Mark wrote:
I am sure you all know this. But the problem increases as the number of 
styles grows. In looking at the code, I am convinced that the problem 
can be found in the fact that when borders are drawn, the cell style is 
retrieved, the border is applied, and all styles are searched for a 
matching style. If one is not found, then a new one is created


This is the bit where I'm losing you. Surely you create a dozen or so 
styles at the start of creating your workbook, with the various colours 
and borders that you want, then you simply apply them to your cells as you 
work through creating your workbook. You shouldn't need to be creating 
styles as you go, adding various bits of borders in to them.


Styles in Excel are, due to how the file format works, scoped at the 
workbook level and not the cell level. You shouldn't therefore be creating 
styles as you go, or you'll run out of available styles!


Nick




Re: method to return POI version?

2015-10-31 Thread Nick Burch

On Fri, 30 Oct 2015, Philip Nienhuis wrote:

Is there a method that will return the version of Apache POI?


Does the org.apache.poi.Version class not cover you?

Nick




Re: POI XSSF and ClipBoard issue

2015-10-22 Thread Nick Burch

On Thu, 22 Oct 2015, Sam' wrote:

Yes the attachment is in XSSF format. I produced the file without POI. That
means I opened Excel on my computer, then fill the cell and add a comment
and saved it.

Then I opened it again, clicked on the first cell and did a copy of it.
Then after, I do a "paste" in my Java application and I'm trying to decode
what's in it, and give it to POI.


Ah, that explains why things might be being a bit odd. (It might be like 
password protected files, which get wrapped up in odd ways, or might be 
something else...)



So technically, I am not providing a full spreadsheet file. Simply
extracting what Excel has put into the clipboard. And apparently something
was put in it into a Biff8 format. And that "something" was ok for POI. Now
that if I'm extracting from the clipboard the "XML Spreadsheet" content and
I give it to POI, it's not working.

But maybe the cause is that : what is inside the clipboard is not a fully
constructed content and POI doesn't recognize that as an Excel file.


It probably isn't a full file. Quite what it is, and how close, I'm not 
sure. It'll be interesting to find out!


I'd suggest you open an enhancement ticket in bugzilla. Then, create a 
simple spreadsheet in XLS and XLSX format, and attach those. Next, copy a 
cell, paste it into your app, save that to disk, then repeat for the other 
format. Upload those two "raw" clipboard files. Finally, run the Apache 
Tika App in --detect mode on the two clipboard files and report what that 
thinks they are


You'll likely need to do much of the work yourself, but with the above we 
might be able to give you some pointers!


Nick




Re: Zoom settings for XWPF?

2015-09-01 Thread Nick Burch

On Fri, 28 Aug 2015, Mark Beardsley wrote:

Thanks Dominic, I will not bother digging any further into the api to
discover how to convert from the string descriptor into the relationship ID,
how to get this "rId4", from this
"http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings".
I am not happy assuming that the settings part of the archive is always
referenced by the ID 'rId4' either.


You should fetch, using the OPC package part support, the relationship 
from the document part to the settings part. That'll give you the 
relationship ID to use. I think POIXMLDocumentPart should have a method to 
get you that


Nick




Re: Word to OOXML Conversion using Apache POI

2015-07-29 Thread Nick Burch

On Wed, 29 Jul 2015, Sarfaraz Husain wrote:
Is there a way to convert the Word Document to OOXML using Apache POI 
and save this XML?


As in to convert a Word .doc file into a Word .docx file? (.docx files are 
OOXML-based)


If so, sadly the answer is no, not using Apache POI. It would be nice to have a 
converter like that, but thus far no-one has volunteered to put in the 
time to write one. If someone wanted to contribute one, the Word to FO / 
HTML converters would probably be a good place to start.


For now, using something like OpenOffice (probably via jodconverter) is 
likely to be your best bet


Nick




Re: Word to OOXML Conversion using Apache POI

2015-07-29 Thread Nick Burch

On Wed, 29 Jul 2015, Sarfaraz Husain wrote:

I am looking for a way to convert Word .docx file to XML with all information
in that XML like wordML, images and other relation xml etc. all in one XML
file.


.docx files are already XML, it's a zip file with several different xml 
files in there (content, styles etc), all in the OOXML-defined structures 
and format
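
To see this for yourself, any zip tool will list the parts. A minimal 
sketch in plain Java (no POI involved) is below. It builds a tiny stand-in 
archive in memory so that it runs without a real .docx; the class name, the 
method names and the three part names are illustrative assumptions, and a 
real document will contain more parts than this.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class DocxPartLister {
    /** Returns the names of all entries in a zip archive held in memory. */
    static List<String> listParts(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Builds a tiny stand-in archive (hypothetical contents) so the sketch runs without a real .docx. */
    static byte[] sampleArchive() {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bos)) {
            for (String name : new String[] {"[Content_Types].xml", "word/document.xml", "word/styles.xml"}) {
                zout.putNextEntry(new ZipEntry(name));
                zout.write("<x/>".getBytes(StandardCharsets.UTF_8));
                zout.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        // Print one part name per line
        listParts(sampleArchive()).forEach(System.out::println);
    }
}
```

For a real file you would read the .docx bytes from disk and pass them to 
listParts instead of using the sample archive.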


Nick




Re: Apache POI is not supporting MS Word 2013(Strict Open XML Document) - .docx

2015-07-09 Thread Nick Burch

On Thu, 9 Jul 2015, rathnamm wrote:
If I save a MS Word document as Strict Open XML Document(*.docx) then I 
am not able to process this document. I am getting below exception when 
I pass this document in XWPFDocument(InputStream) code.


What version of Apache POI are you trying this with? And if it isn't the 
most recent, what happens when you upgrade?


Nick




Re: Apache POI is not supporting MS Word 2013(Strict Open XML Document) - .docx

2015-07-09 Thread Nick Burch

On Thu, 9 Jul 2015, rathnamm wrote:
I am using Apache POI 3.8 but I tried with Apache POI 3.10 as well but 
it is not working.


And what happens if you try the current latest release? (Those are both 
old releases)


Nick




Re: Apache POI

2015-07-02 Thread Nick Burch

On Wed, 1 Jul 2015, karthikeyan S wrote:

Using Apache poi How many row and column to read in excel .xlsx format??


Did you try looking at the Iterating over rows and cells part of the 
documentation?

http://poi.apache.org/spreadsheet/quick-guide.html#Iterator

Nick




Re: POI stop no response and errors

2015-06-05 Thread Nick Burch

On Fri, 5 Jun 2015, donli wrote:

System.out.println("Start to parse DOC file.");
POIFSFileSystem fs = new POIFSFileSystem(u.openStream());
System.out.println("File opened.");
I run this code in a loop, open the word document and parse the document.
Sometimes the code stopped at new POIFSFileSystem(u.openStream()) without
any warnings and errors for several hours. I have to restart the program.


Try with NPOIFSFileSystem. Also, try spooling the remote stream to a file 
and read from there, in case it's your remote server timing out
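
A minimal sketch of the spooling step, using only the JDK, is below. The 
class and method names are made up for illustration. The idea is that the 
slow or unreliable network read happens on its own, so a hang there is 
easy to tell apart from a hang inside POI. For a real URL you would 
probably also want to set connect and read timeouts on the URLConnection 
before opening the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class StreamSpooler {
    /** Copies the whole stream to a temporary file, closes the stream, and returns the file's path. */
    static Path spoolToTempFile(InputStream in) {
        try {
            Path tmp = Files.createTempFile("poi-spool-", ".doc");
            try (InputStream src = in) {
                // createTempFile already created the file, so it has to be replaced
                Files.copy(src, tmp, StandardCopyOption.REPLACE_EXISTING);
            }
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for u.openStream(): three bytes held in memory
        Path tmp = spoolToTempFile(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println(tmp + " holds " + Files.size(tmp) + " bytes");
        Files.delete(tmp);
    }
}
```

Once the file is on disk, it can be handed to the File-based constructor 
of NPOIFSFileSystem (as far as I recall there is one), which also avoids 
buffering the whole stream in memory.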


Nick




Re: WorkbookFactory.create taking long time

2015-05-26 Thread Nick Burch

On Tue, 26 May 2015, Vipin wrote:

Workbook wb = 
WorkbookFactory.create(attachment.getDataHandler().getInputStream());

This call takes around 1400 ms to read 10k rows , each row has 4 column 
and there is no formula. All columns have numbers and strings only.


I'd suggest you start with the advice in 
http://poi.apache.org/faq.html#faq-N10109, as well as ensuring that the 
delay isn't actually reading your attachment over the network!


There's also http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api if 
you really can't use XSSF usermodel, but for a file of that size it's 
unlikely you'd need it





[Announce] New Committer and PMC Member, David North

2015-05-21 Thread Nick Burch

Hi All

On behalf of the Apache POI PMC, we're pleased to announce that David 
North has been elected as a committer and PMC member for the project!


For anyone else thinking of getting involved, there's some information 
available on the project website at http://poi.apache.org/guidelines.html 
along with some general advice at http://community.apache.org/newcomers/ . 
The project welcomes all kinds of contributions, from patches to 
documentation to community support to improvements to fixes, all we ask is 
that you make a positive contribution for a sustained period. Hopefully 
some more of you can be added in this year too :)


Nick




Re: setting up a header

2015-05-14 Thread Nick Burch

On Thu, 14 May 2015, parthiprft wrote:
I'm using 3.11 version of POI, I have to set an header in an excel 
sheet(this header should be visible even if I scroll up or scroll down), 
is it possible ?


See http://poi.apache.org/spreadsheet/quick-guide.html#Splits

Nick




Re: NPE on OPCPackage

2015-05-11 Thread Nick Burch

On Mon, 11 May 2015, João Miguel Penetra wrote:

I have a web site where users can upload CSV and XLSX files and I use
Apache-POI to parse XLSX files. However, I discovered something unusual.
When I convert a CSV file to XLSX on Excel 2013 I get a NPE while
initializing XSSFReader:

java.lang.NullPointerException
at org.apache.poi.openxml4j.opc.OPCPackage.getPart(OPCPackage.java:625)
at org.apache.poi.xssf.eventusermodel.XSSFReader.<init>(XSSFReader.java:67)


That's an odd one. Any chance you could raise a new bug for this in 
bugzilla - http://issues.apache.org/bugzilla/buglist.cgi?product=POI - and 
upload the excel file that triggers it?


Nick


Re: java.lang.reflect.InvocationTargetException in pptx conversion using poi

2015-05-08 Thread Nick Burch

On Fri, 8 May 2015, Shiva Kumar wrote:

Please give me a solution. Is there any class that I can override solve the
issue if not possible please take this as a bug report and solve this.

Caused by: java.lang.NoClassDefFoundError:
org/openxmlformats/schemas/presentationml/x2006/main/SldMasterDocument$Factory


See this faq: http://poi.apache.org/faq.html#faq-N10025

Nick




Re: Reading XLSX

2015-04-30 Thread Nick Burch

On Thu, 30 Apr 2015, João Miguel Penetra wrote:
This guide has been very useful and I've managed to reduce the memory 
problems and optimize the performance but now I have issues parsing the 
data. I have these two examples:


<c r="B2" s="3"><v>33015</v></c>
and
<c r="D2" s="1"><v>91001</v></c>

In the first example I figured it out that I could
use HSSFDateUtil.getJavaDate in order to get a proper date but how can I
know I'm looking at a date and not other thing? How can I establish a
connection between the s attribute and the type of cell I'm looking at?


You'll need to look at the cell styles to work that out.

Two good examples of doing this are:
https://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/xssf/eventusermodel/XLSX2CSV.java
https://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/XSSFExcelExtractorDecorator.java
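
As a rough illustration of the idea (not a copy of what either of those 
classes does): the s attribute on a c element is an index into the cellXfs 
list in styles.xml, and each xf entry there carries a numFmtId. The sketch 
below is plain Java with a hypothetical, hard-coded numFmtId table standing 
in for a parsed styles.xml, so the table contents and all of the names are 
assumptions. It only recognises the built-in date/time format ids; custom 
formats would need their format strings inspected as well.

```java
import java.time.LocalDate;

final class SaxDateSketch {
    // Hypothetical stand-in for styles.xml: entry i is the numFmtId of the i-th xf in cellXfs
    static final int[] CELL_XF_NUM_FMT_IDS = {0, 49, 0, 14};

    /** True for the built-in date/time format ids (14-22 and 45-47). Custom formats are not handled. */
    static boolean isBuiltinDateFormat(int numFmtId) {
        return (numFmtId >= 14 && numFmtId <= 22) || (numFmtId >= 45 && numFmtId <= 47);
    }

    /** Converts an Excel serial day number (1900 date system, dates after February 1900) to a date. */
    static LocalDate serialToDate(double serial) {
        return LocalDate.of(1899, 12, 30).plusDays((long) Math.floor(serial));
    }

    /** Renders the raw v value of a cell, given the cell's s attribute. */
    static String render(int styleIndex, String rawValue) {
        int numFmtId = CELL_XF_NUM_FMT_IDS[styleIndex];
        if (isBuiltinDateFormat(numFmtId)) {
            return serialToDate(Double.parseDouble(rawValue)).toString();
        }
        return rawValue;
    }

    public static void main(String[] args) {
        System.out.println(render(3, "33015")); // style 3 is a date style in the assumed table
        System.out.println(render(1, "91001")); // style 1 is not, so the raw value is kept
    }
}
```

In real code the table would be built from the styles part of the 
workbook rather than hard-coded, which is the part the two linked examples 
show how to do.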

Nick


Re: Remove Invalid XML in DOCX

2015-04-27 Thread Nick Burch

On Mon, 27 Apr 2015, Michael Nguyen wrote:
We convert MS-WORD documents to DOCX using LibreOffice (clunky), but 
some files are unreadable because they contain invalid UTF-8 characters 
in the XML that version 1.0 and 1.1 of XML do not like.


Your best long term fix is to report the bug to Apache OpenOffice, get it 
fixed there, then wait for LibreOffice to accept the fix.


LibreOffice does not care, but we need to read these documents into POI. 
Short of disassembling the archive file and editing the appropriate XML 
files in the container, I was wondering if there was a way to edit the 
PackagePart data for the relevant bits (it's the word/document.xml this 
is occurring in most frequently).  The PackagePart API makes it unclear 
how to read the XML into memory and edit, then re-write to the part.


Once you have a PackagePart, call getInputStream() to read the contents. 
Work your way through that, updating / fixing things. Possibly use IOUtils 
to get the stream as a byte array. When done, call getOutputStream() and 
write the new contents into it, then save the overall package


If you have invalid XML, you can't fix it at the XML level, you'll need to 
fix it at the byte level
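
One possible shape for the fixing step, as a sketch in plain Java: take the 
bytes read from the part's getInputStream(), decode them as UTF-8, drop 
every code point that the XML 1.0 Char production does not allow, and 
re-encode before writing to getOutputStream(). The class and method names 
are made up, and the sketch assumes the part is UTF-8 encoded. It will not 
catch forbidden characters written as numeric character references, and 
malformed byte sequences end up as U+FFFD rather than being removed.

```java
import java.nio.charset.StandardCharsets;

final class XmlCharScrubber {
    /** True if the code point is allowed by the XML 1.0 Char production. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Decodes the bytes as UTF-8, drops disallowed code points, and re-encodes as UTF-8. */
    static byte[] scrub(byte[] rawPartBytes) {
        String text = new String(rawPartBytes, StandardCharsets.UTF_8);
        StringBuilder clean = new StringBuilder(text.length());
        text.codePoints().filter(XmlCharScrubber::isXml10Char).forEach(clean::appendCodePoint);
        return clean.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A run of text with a stray U+0001 control character in it
        String broken = "<w:t>bad" + (char) 0x1 + "char</w:t>";
        byte[] fixed = scrub(broken.getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(fixed, StandardCharsets.UTF_8));
    }
}
```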


Nick




Re: Cell Type - Numbering Format

2015-04-14 Thread Nick Burch

On Tue, 14 Apr 2015, mskavim wrote:

I am creating Bill Of Material report using Apache POI. I have a column for
Part Numbers... there are few occasion where P/N has 0 in the prefix like

000501
000502

... so on. if I set the Cell Type as Numeric.  The prefix 0 is getting
stripped.  is there alternative solution for this scenario?


What happens if you set a cell format which enforces a minimum of 6 zeros? 
(Use the same format you'd set in Excel for that)
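
To show what such a pattern does, here is a small sketch using the JDK's 
DecimalFormat as a stand-in for Excel's number formatting, since the 
pattern 000000 means "at least six digits, zero padded" in both. The class 
and method names are illustrative only. In POI itself the same pattern 
string would, as far as I know, be registered through 
workbook.createDataFormat().getFormat("000000") and applied to the cell 
style with setDataFormat, leaving the cell numeric.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class ZeroPadDemo {
    /** Formats a part number with at least six digits, padding with leading zeros. */
    static String pad(long partNumber) {
        // Locale.ROOT keeps the digits ASCII regardless of the machine's default locale
        DecimalFormat fmt = new DecimalFormat("000000", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return fmt.format(partNumber);
    }

    public static void main(String[] args) {
        System.out.println(pad(501));
        System.out.println(pad(502));
    }
}
```

Note that the stored value is still the number 501; only the displayed 
text gains the leading zeros.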


Nick



