PJ,
 I just wanted to follow up on what you mentioned the other day

That copyRows method doesn't allow you to specify the CellCopyContext.
> We probably should add a variant of copyRows that supports this extra
> param. The context keeps track of styles so that it avoids adding the
> same style over and over - i.e. it spots the duplicate styles.


Can you please make a ticket for this also? Thanks!

On Wed, Feb 12, 2025 at 5:10 PM PJ Fanning <fannin...@apache.org> wrote:

> I logged https://bz.apache.org/bugzilla/show_bug.cgi?id=69583
>
> On Wed, 12 Feb 2025 at 21:00, Dan S <dsti...@gmail.com> wrote:
> >
> > PJ,
> >  I think I found the issue and I believe this would constitute a bug but
> I
> > would like your confirmation. I narrowed down the problem to lines
> 237-239
> > in CellUtil
> >
> > if (DateUtil.isCellDateFormatted(srcCell)) {
> >        destCell.setCellValue(srcCell.getDateCellValue());
> > }
> >
> > where in my case destCell is an instance of XSSFCell where I assume
> > (although I could not track exactly) this set method calls the set method
> > in XSSFCell on lines 460-463
> >
> > protected void setCellValueImpl(Date value) {
> >         boolean date1904 = this.getSheet().getWorkbook().isDate1904();
> >         this.setCellValue(DateUtil.getExcelDate(value, date1904));
> > }
> >
> > The third column of my data was only a time (e.g. in java.util.Date
> form: Sun
> > Dec 31 09:53:44 UTC 1899) which per the Javadocs on
> DateUtil.getExcelDate is
> > not a valid date hence -1 was returned and thereby any subsequent call
> > to getDateCellValue()
> > returned null as no date was ever set.
> >
> > I was able to get copyCell working with the following in the switch
> > statement
> >
> > case NUMERIC:
> >       destCell.setCellValue(srcCell.getNumericCellValue());
> >       break;
> >
> > without the call to the setCellValue method that takes a java.util.Date
> thereby
> > bypassing the call of DateUtil.getExcelDate.
> > Please let me know your thoughts on whether copyCell should be changed or
> > not. Thanks!
> >
> > On Wed, Feb 12, 2025 at 11:20 AM Dan S <dsti...@gmail.com> wrote:
> >
> > > With some System.out statements this is what the resulting sheet in the
> > > workbook looks like. I would have thought all cell 2 columns except
> for row
> > > 1
> > > should have had a value which is a date. Instead I see a negative -1
> value.
> > >
> > > row 0 cell 0 cell type STRING value transaction_id
> > > row 0 cell 1 cell type STRING value transaction_date
> > > row 0 cell 2 cell type STRING value transaction_time
> > > row 1 cell 0 cell type NUMERIC value 75.0
> > > row 1 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023
> > > row 1 cell 2 cell type NUMERIC value -1.0
> > > row 2 cell 0 cell type NUMERIC value 78.0
> > > row 2 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023
> > > row 2 cell 2 cell type NUMERIC value -1.0
> > > row 3 cell 0 cell type NUMERIC value 80.0
> > > row 3 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023
> > > row 3 cell 2 cell type NUMERIC value -1.0
> > > row 4 cell 0 cell type NUMERIC value 81.0
> > > row 4 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023
> > > row 4 cell 2 cell type NUMERIC value -1.0
> > > row 5 cell 0 cell type NUMERIC value 82.0
> > > row 5 cell 1 cell type NUMERIC value Sun Jan 01 00:00:00 UTC 2023
> > > row 5 cell 2 cell type NUMERIC value -1.0
> > >
> > > On Wed, Feb 12, 2025 at 10:08 AM Dan S <dsti...@gmail.com> wrote:
> > >
> > >> Oops sorry for the typo it should read
> > >> In other words the third column has time values and when read without
> my
> > >> code are numeric cell types AND ARE date formatted but after running
> > >> through my code they seem to only be numeric cell types and NOT date
> > >> formatted.
> > >>
> > >> On Wed, Feb 12, 2025 at 10:06 AM Dan S <dsti...@gmail.com> wrote:
> > >>
> > >>> In other words the third column has time values and when read
> without my
> > >>> code are numeric cell types are not date formatted but after running
> > >>> through my code they seem to only be numeric cell types and not date
> > >>> formatted.
> > >>>
> > >>> On Wed, Feb 12, 2025 at 9:24 AM Dan S <dsti...@gmail.com> wrote:
> > >>>
> > >>>> PJ,
> > >>>>  I realize  it handles formatted dates. My question is that it
> appears
> > >>>> my changes only handle one of the columns with dates but not the
> other. I
> > >>>> understood from you the changes I made should have handled both.
> Please
> > >>>> advise.
> > >>>>
> > >>>> On Wed, Feb 12, 2025 at 6:36 AM PJ Fanning <fannin...@apache.org>
> > >>>> wrote:
> > >>>>
> > >>>>> I have some tests that appear to show that that test class I
> > >>>>> referenced earlier does handle formatted dates. I added an extra
> one
> > >>>>> today.
> > >>>>>
> > >>>>>
> > >>>>>
> https://github.com/pjfanning/excel-streaming-reader/commit/d26bdb3be3aac829a77ec6912bb63811c4812e06
> > >>>>>
> > >>>>> On Wed, 12 Feb 2025 at 05:27, Dan S <dsti...@gmail.com> wrote:
> > >>>>> >
> > >>>>> > Thank you very much for that example. It was very helpful.
> Based on
> > >>>>> the example,  I changed my code, changing back the cellStyle to
> true in the
> > >>>>> CellCopyPolicy and I refactored the method copyRows from
> > >>>>> org.apache.poi.xssf.usermodel.XSSFSheet and method copyRowFrom from
> > >>>>> org.apache.poi.xssf.usermodel.XSSFRow instantiating a
> CellCopyContext in
> > >>>>> copyRows. The problem though is when I ran the following worksheet
> through,
> > >>>>> the resulting workbook with a single sheet has date formatted
> columns (as
> > >>>>> expected) in the column with header transaction_date is but the
> values in
> > >>>>> the column header transaction time is not date formatted where it
> should be
> > >>>>> (as it is in the original file). Attached is my new code and the
> sample
> > >>>>> file. Please advise. Thank you so much!
> > >>>>> >
> > >>>>> > On Tue, Feb 11, 2025 at 4:18 PM PJ Fanning <fannin...@apache.org
> >
> > >>>>> wrote:
> > >>>>> >>
> > >>>>> >> That copyRows method doesn't allow you to specify the
> > >>>>> CellCopyContext.
> > >>>>> >> We probably should add a variant of copyRows that supports this
> > >>>>> extra
> > >>>>> >> param. The context keeps track of styles so that it avoids
> adding
> > >>>>> the
> > >>>>> >> same style over and over - i.e. it spots the duplicate styles.
> > >>>>> >>
> > >>>>> >> This example might be useful
> > >>>>> >>
> > >>>>>
> https://github.com/pjfanning/excel-streaming-reader/blob/main/src/test/java/com/github/pjfanning/xlsx/CopyToSXSSFUtil.java
> > >>>>> >>
> > >>>>> >> CellUtil.copyCell - this method can take a CellCopyContext
> instance.
> > >>>>> >>
> > >>>>> >> On Tue, 11 Feb 2025 at 17:48, Dan S <dsti...@gmail.com> wrote:
> > >>>>> >> >
> > >>>>> >> > I hope I am addressing the right list. I am an Apache NIFI
> > >>>>> developer and I
> > >>>>> >> > have developed an Apache NIFI processor named SplitExcel which
> > >>>>> splits a
> > >>>>> >> > multi sheet Microsoft Excel spreadsheet into multiple
> Microsoft
> > >>>>> Excel
> > >>>>> >> > spreadsheets where each sheet from the original file is
> converted
> > >>>>> to an
> > >>>>> >> > individual spreadsheet. Source code for this can be found here
> > >>>>> >> > <
> > >>>>>
> https://github.com/apache/nifi/blob/main/nifi-extension-bundles/nifi-poi-bundle/nifi-poi-services/src/main/java/org/apache/nifi/processors/excel/SplitExcel.java
> > >>>>> >
> > >>>>> >> > (lines
> > >>>>> >> > 165-188).
> > >>>>> >> > In order to accomplish this, I use the
> > >>>>> >> > com.github.pjfanning.xlsx.StreamingReader to stream the Excel
> > >>>>> workbook
> > >>>>> >> > which allows for easy looping over each sheet. I create a new
> > >>>>> XSSFWorkbook
> > >>>>> >> > for each sheet there is in the existing workbook and I use
> > >>>>> XSSFSheet
> > >>>>> >> > copyRows method to copy the rows from the existing sheet to a
> new
> > >>>>> sheet to
> > >>>>> >> > place in the new XSSFWorkbook. The CellCopyPolicy used has
> > >>>>> cellStyle set to
> > >>>>> >> > false in order to avoid exceeding the maximum number of cell
> > >>>>> styles (fixed
> > >>>>> >> > in NIFI-13726 <
> https://issues.apache.org/jira/browse/NIFI-13726>).
> > >>>>> The
> > >>>>> >> > drawback though of using cellStyle set to false is that any
> > >>>>> columns which
> > >>>>> >> > have dates, its date formatting is lost and what is left is a
> > >>>>> meaningless
> > >>>>> >> > number (reported in NIFI-14106
> > >>>>> >> > <https://issues.apache.org/jira/browse/NIFI-14106>). Is
> there a
> > >>>>> way I can
> > >>>>> >> > use copyRows not to exceed the maximum number of cell styles
> and
> > >>>>> yet have
> > >>>>> >> > the formatting needed for dates?
> > >>>>> >
> > >>>>> >
> > >>>>> >
> ---------------------------------------------------------------------
> > >>>>> > To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
> > >>>>> > For additional commands, e-mail: user-h...@poi.apache.org
> > >>>>>
> > >>>>>
> ---------------------------------------------------------------------
> > >>>>> To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
> > >>>>> For additional commands, e-mail: user-h...@poi.apache.org
> > >>>>>
> > >>>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
> For additional commands, e-mail: user-h...@poi.apache.org
>
>

Reply via email to