PJ, I think I found the issue and I believe this would constitute a bug but I would like your confirmation. I narrowed down the problem to lines 237-239 in CellUtil
if (DateUtil.isCellDateFormatted(srcCell)) { destCell.setCellValue(srcCell.getDateCellValue()); } where in my case destCell is an instance of XSSFCell where I assume (although I could not track exactly) this set method calls the set method in XSSFCell on lines 460-463 protected void setCellValueImpl(Date value) { boolean date1904 = this.getSheet().getWorkbook().isDate1904(); this.setCellValue(DateUtil.getExcelDate(value, date1904)); } The third column of my data was only a time (e.g. in java.util.Date form: Sun Dec 31 09:53:44 UTC 1899) which per the Javadocs on DateUtil.getExcelDate is not a valid date hence -1 was returned and thereby any subsequent call to getDateCellValue() returned null as no date was ever set. I was able to get copyCell working with the following in the switch statement case NUMERIC: destCell.setCellValue(srcCell.getNumericCellValue()); break; without the call to the setCellValue method that takes a java.util.Date thereby bypassing the call of DateUtil.getExcelDate. Please let me know your thoughts on whether copyCell should be changed or not. Thanks! On Wed, Feb 12, 2025 at 11:20 AM Dan S <dsti...@gmail.com> wrote: > With some System.out statements this is what the resulting sheet in the > workbook looks like. I would have thought all cell 2 columns except for row > 1 > should have had a value which is a date. Instead I see a negative -1 value. > > row 0 cell 0 cell type STRING value transaction_id > row 0 cell 1 cell type STRING value transaction_date > row 0 cell 2 cell type STRING value transaction_time > row 1 cell 0 cell type NUMERIC value 75.0 > row 1 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023 > row 1 cell 2 cell type NUMERIC value -1.0 > row 2 cell 0 cell type NUMERIC value 78.0 > row 2 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023 > row 2 cell 2 cell type NUMERIC value -1.0 > row 3 cell 0 cell type NUMERIC value 80.0 > row 3 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023 > row 3 cell 2 cell type NUMERIC value -1.0 > row 4 cell 0 cell type NUMERIC value 81.0 > row 4 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023 > row 4 cell 2 cell type NUMERIC value -1.0 > row 5 cell 0 cell type NUMERIC value 82.0 > row 5 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023 > row 5 cell 2 cell type NUMERIC value -1.0 > > On Wed, Feb 12, 2025 at 10:08 AM Dan S <dsti...@gmail.com> wrote: > >> Oops sorry for the typo it should read >> In other words the third column has time values and when read without my >> code are numeric cell types AND ARE date formatted but after running >> through my code they seem to only be numeric cell types and NOT date >> formatted. >> >> On Wed, Feb 12, 2025 at 10:06 AM Dan S <dsti...@gmail.com> wrote: >> >>> In other words the third column has time values and when read without my >>> code are numeric cell types are not date formatted but after running >>> through my code they seem to only be numeric cell types and not date >>> formatted. >>> >>> On Wed, Feb 12, 2025 at 9:24 AM Dan S <dsti...@gmail.com> wrote: >>> >>>> PJ, >>>> I realize it handles formatted dates. My question is that it appears >>>> my changes only handle one of the columns with dates but not the other. I >>>> understood from you the changes I made should have handled both. Please >>>> advise. >>>> >>>> On Wed, Feb 12, 2025 at 6:36 AM PJ Fanning <fannin...@apache.org> >>>> wrote: >>>> >>>>> I have some tests that appear to show that that test class I >>>>> referenced earlier does handle formatted dates. I added an extra one >>>>> today. >>>>> >>>>> >>>>> https://github.com/pjfanning/excel-streaming-reader/commit/d26bdb3be3aac829a77ec6912bb63811c4812e06 >>>>> >>>>> On Wed, 12 Feb 2025 at 05:27, Dan S <dsti...@gmail.com> wrote: >>>>> > >>>>> > Thank you very much for that example. It was very helpful. Based on >>>>> the example, I changed my code, changing back the cellStyle to true in >>>>> the >>>>> CellCopyPolicy and I refactored the method copyRows from >>>>> org.apache.poi.xssf.usermodel.XSSFSheet and method copyRowFrom from >>>>> org.apache.poi.xssf.usermodel.XSSFRow instantiating a CellCopyContext in >>>>> copyRows. The problem though is when I ran the following worksheet >>>>> through, >>>>> the resulting workbook with a single sheet has date formatted columns (as >>>>> expected) in the column with header transaction_date is but the values in >>>>> the column header transaction time is not date formatted where it should >>>>> be >>>>> (as it is in the original file). Attached is my new code and the sample >>>>> file. Please advise. Thank you so much! >>>>> > >>>>> > On Tue, Feb 11, 2025 at 4:18 PM PJ Fanning <fannin...@apache.org> >>>>> wrote: >>>>> >> >>>>> >> That copyRows method doesn't allow you to specify the >>>>> CellCopyContext. >>>>> >> We probably should add a variant of copyRows that supports this >>>>> extra >>>>> >> param. The context keeps track of styles so that it avoids adding >>>>> the >>>>> >> same style over and over - i.e. it spots the duplicate styles. >>>>> >> >>>>> >> This example might be useful >>>>> >> >>>>> https://github.com/pjfanning/excel-streaming-reader/blob/main/src/test/java/com/github/pjfanning/xlsx/CopyToSXSSFUtil.java >>>>> >> >>>>> >> CellUtil.copyCell - this method can take a CellCopyContext instance. >>>>> >> >>>>> >> On Tue, 11 Feb 2025 at 17:48, Dan S <dsti...@gmail.com> wrote: >>>>> >> > >>>>> >> > I hope I am addressing the right list. I am an Apache NIFI >>>>> developer and I >>>>> >> > have developed an Apache NIFI processor named SplitExcel which >>>>> splits a >>>>> >> > multi sheet Microsoft Excel spreadsheet into multiple Microsoft >>>>> Excel >>>>> >> > spreadsheets where each sheet from the original file is converted >>>>> to an >>>>> >> > individual spreadsheet. Source code for this can be found here >>>>> >> > < >>>>> https://github.com/apache/nifi/blob/main/nifi-extension-bundles/nifi-poi-bundle/nifi-poi-services/src/main/java/org/apache/nifi/processors/excel/SplitExcel.java >>>>> > >>>>> >> > (lines >>>>> >> > 165-188). >>>>> >> > In order to accomplish this, I use the >>>>> >> > com.github.pjfanning.xlsx.StreamingReader to stream the Excel >>>>> workbook >>>>> >> > which allows for easy looping over each sheet. I create a new >>>>> XSSFWorkbook >>>>> >> > for each sheet there is in the existing workbook and I use >>>>> XSSFSheet >>>>> >> > copyRows method to copy the rows from the existing sheet to a new >>>>> sheet to >>>>> >> > place in the new XSSFWorkbook. The CellCopyPolicy used has >>>>> cellStyle set to >>>>> >> > false in order to avoid exceeding the maximum number of cell >>>>> styles (fixed >>>>> >> > in NIFI-13726 <https://issues.apache.org/jira/browse/NIFI-13726>). >>>>> The >>>>> >> > drawback though of using cellStyle set to false is that any >>>>> columns which >>>>> >> > have dates, its date formatting is lost and what is left is a >>>>> meaningless >>>>> >> > number (reported in NIFI-14106 >>>>> >> > <https://issues.apache.org/jira/browse/NIFI-14106>). Is there a >>>>> way I can >>>>> >> > use copyRows not to exceed the maximum number of cell styles and >>>>> yet have >>>>> >> > the formatting needed for dates? >>>>> > >>>>> > >>>>> > --------------------------------------------------------------------- >>>>> > To unsubscribe, e-mail: user-unsubscr...@poi.apache.org >>>>> > For additional commands, e-mail: user-h...@poi.apache.org >>>>> >>>>> --------------------------------------------------------------------- >>>>> To unsubscribe, e-mail: user-unsubscr...@poi.apache.org >>>>> For additional commands, e-mail: user-h...@poi.apache.org >>>>> >>>>>