PJ,
 I think I found the issue and I believe this would constitute a bug but I
would like your confirmation. I narrowed down the problem to lines 237-239
in CellUtil

if (DateUtil.isCellDateFormatted(srcCell)) {
       destCell.setCellValue(srcCell.getDateCellValue());
}

where in my case destCell is an instance of XSSFCell where I assume
(although I could not track exactly) this set method calls the set method
in XSSFCell on lines 460-463

protected void setCellValueImpl(Date value) {
        boolean date1904 = this.getSheet().getWorkbook().isDate1904();
        this.setCellValue(DateUtil.getExcelDate(value, date1904));
}

The third column of my data was only a time (e.g. in java.util.Date form: Sun
Dec 31 09:53:44 UTC 1899) which per the Javadocs on DateUtil.getExcelDate is
not a valid date hence -1 was returned and thereby any subsequent call
to getDateCellValue()
returned null as no date was ever set.

I was able to get copyCell working with the following in the switch
statement

case NUMERIC:
      destCell.setCellValue(srcCell.getNumericCellValue());
      break;

without the call to the setCellValue method that takes a java.util.Date thereby
bypassing the call of DateUtil.getExcelDate.
Please let me know your thoughts on whether copyCell should be changed or
not. Thanks!

On Wed, Feb 12, 2025 at 11:20 AM Dan S <dsti...@gmail.com> wrote:

> With some System.out statements this is what the resulting sheet in the
> workbook looks like. I would have thought all cell 2 columns except for row
> 1
> should have had a value which is a date. Instead I see a negative -1 value.
>
> row 0 cell 0 cell type STRING value transaction_id
> row 0 cell 1 cell type STRING value transaction_date
> row 0 cell 2 cell type STRING value transaction_time
> row 1 cell 0 cell type NUMERIC value 75.0
> row 1 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023
> row 1 cell 2 cell type NUMERIC value -1.0
> row 2 cell 0 cell type NUMERIC value 78.0
> row 2 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023
> row 2 cell 2 cell type NUMERIC value -1.0
> row 3 cell 0 cell type NUMERIC value 80.0
> row 3 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023
> row 3 cell 2 cell type NUMERIC value -1.0
> row 4 cell 0 cell type NUMERIC value 81.0
> row 4 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023
> row 4 cell 2 cell type NUMERIC value -1.0
> row 5 cell 0 cell type NUMERIC value 82.0
> row 5 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023
> row 5 cell 2 cell type NUMERIC value -1.0
>
> On Wed, Feb 12, 2025 at 10:08 AM Dan S <dsti...@gmail.com> wrote:
>
>> Oops sorry for the typo it should read
>> In other words the third column has time values and when read without my
>> code are numeric cell types AND ARE date formatted but after running
>> through my code they seem to only be numeric cell types and NOT date
>> formatted.
>>
>> On Wed, Feb 12, 2025 at 10:06 AM Dan S <dsti...@gmail.com> wrote:
>>
>>> In other words the third column has time values and when read without my
>>> code are numeric cell types are not date formatted but after running
>>> through my code they seem to only be numeric cell types and not date
>>> formatted.
>>>
>>> On Wed, Feb 12, 2025 at 9:24 AM Dan S <dsti...@gmail.com> wrote:
>>>
>>>> PJ,
>>>>  I realize  it handles formatted dates. My question is that it appears
>>>> my changes only handle one of the columns with dates but not the other. I
>>>> understood from you the changes I made should have handled both. Please
>>>> advise.
>>>>
>>>> On Wed, Feb 12, 2025 at 6:36 AM PJ Fanning <fannin...@apache.org>
>>>> wrote:
>>>>
>>>>> I have some tests that appear to show that that test class I
>>>>> referenced earlier does handle formatted dates. I added an extra one
>>>>> today.
>>>>>
>>>>>
>>>>> https://github.com/pjfanning/excel-streaming-reader/commit/d26bdb3be3aac829a77ec6912bb63811c4812e06
>>>>>
>>>>> On Wed, 12 Feb 2025 at 05:27, Dan S <dsti...@gmail.com> wrote:
>>>>> >
>>>>> > Thank you very much for that example. It was very helpful.  Based on
>>>>> the example,  I changed my code, changing back the cellStyle to true in 
>>>>> the
>>>>> CellCopyPolicy and I refactored the method copyRows from
>>>>> org.apache.poi.xssf.usermodel.XSSFSheet and method copyRowFrom from
>>>>> org.apache.poi.xssf.usermodel.XSSFRow instantiating a CellCopyContext in
>>>>> copyRows. The problem though is when I ran the following worksheet 
>>>>> through,
>>>>> the resulting workbook with a single sheet has date formatted columns (as
>>>>> expected) in the column with header transaction_date is but the values in
>>>>> the column header transaction time is not date formatted where it should 
>>>>> be
>>>>> (as it is in the original file). Attached is my new code and the sample
>>>>> file. Please advise. Thank you so much!
>>>>> >
>>>>> > On Tue, Feb 11, 2025 at 4:18 PM PJ Fanning <fannin...@apache.org>
>>>>> wrote:
>>>>> >>
>>>>> >> That copyRows method doesn't allow you to specify the
>>>>> CellCopyContext.
>>>>> >> We probably should add a variant of copyRows that supports this
>>>>> extra
>>>>> >> param. The context keeps track of styles so that it avoids adding
>>>>> the
>>>>> >> same style over and over - i.e. it spots the duplicate styles.
>>>>> >>
>>>>> >> This example might be useful
>>>>> >>
>>>>> https://github.com/pjfanning/excel-streaming-reader/blob/main/src/test/java/com/github/pjfanning/xlsx/CopyToSXSSFUtil.java
>>>>> >>
>>>>> >> CellUtil.copyCell - this method can take a CellCopyContext instance.
>>>>> >>
>>>>> >> On Tue, 11 Feb 2025 at 17:48, Dan S <dsti...@gmail.com> wrote:
>>>>> >> >
>>>>> >> > I hope I am addressing the right list. I am an Apache NIFI
>>>>> developer and I
>>>>> >> > have developed an Apache NIFI processor named SplitExcel which
>>>>> splits a
>>>>> >> > multi sheet Microsoft Excel spreadsheet into multiple Microsoft
>>>>> Excel
>>>>> >> > spreadsheets where each sheet from the original file is converted
>>>>> to an
>>>>> >> > individual spreadsheet. Source code for this can be found here
>>>>> >> > <
>>>>> https://github.com/apache/nifi/blob/main/nifi-extension-bundles/nifi-poi-bundle/nifi-poi-services/src/main/java/org/apache/nifi/processors/excel/SplitExcel.java
>>>>> >
>>>>> >> > (lines
>>>>> >> > 165-188).
>>>>> >> > In order to accomplish this, I use the
>>>>> >> > com.github.pjfanning.xlsx.StreamingReader to stream the Excel
>>>>> workbook
>>>>> >> > which allows for easy looping over each sheet. I create a new
>>>>> XSSFWorkbook
>>>>> >> > for each sheet there is in the existing workbook and I use
>>>>> XSSFSheet
>>>>> >> > copyRows method to copy the rows from the existing sheet to a new
>>>>> sheet to
>>>>> >> > place in the new XSSFWorkbook. The CellCopyPolicy used has
>>>>> cellStyle set to
>>>>> >> > false in order to avoid exceeding the maximum number of cell
>>>>> styles (fixed
>>>>> >> > in NIFI-13726 <https://issues.apache.org/jira/browse/NIFI-13726>).
>>>>> The
>>>>> >> > drawback though of using cellStyle set to false is that any
>>>>> columns which
>>>>> >> > have dates, its date formatting is lost and what is left is a
>>>>> meaningless
>>>>> >> > number (reported in NIFI-14106
>>>>> >> > <https://issues.apache.org/jira/browse/NIFI-14106>). Is there a
>>>>> way I can
>>>>> >> > use copyRows not to exceed the maximum number of cell styles and
>>>>> yet have
>>>>> >> > the formatting needed for dates?
>>>>> >
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
>>>>> > For additional commands, e-mail: user-h...@poi.apache.org
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
>>>>> For additional commands, e-mail: user-h...@poi.apache.org
>>>>>
>>>>>

Reply via email to