[ANNOUNCE] Apache Commons Codec 1.16.0

2023-06-21 Thread Gary Gregory
The Apache Commons Team is pleased to announce Apache Commons Codec 1.16.0. The Apache Commons Codec package contains simple encoders and decoders for various formats such as Base64 and Hexadecimal. In addition to these widely used encoders and decoders, the codec package also maintains a collect

Re: [VOTE] Release Apache Commons Codec 1.16.0 based on RC2

2023-06-21 Thread Gary Gregory
This vote passes with the following binding +1 votes from: - Bruno Kinoshita - Rob Tompkins - Gary Gregory Gary On Wed, Jun 21, 2023 at 10:30 AM Gary Gregory wrote: > > My +1 > > Gary > > On Sat, Jun 17, 2023, 20:24 Gary Gregory wrote: >> >> We have fixed a few bugs and added some enhancements

Re: [CSV] Strategies to handle duplicate headers

2023-06-21 Thread sebb
On Tue, 20 Jun 2023 at 12:39, Gary Gregory wrote: > > Hi All, > > This thread is a follow-up to > https://github.com/apache/commons-csv/pull/309#issuecomment-1441456258 > > Bruno says: > "With Pandas it automatically deduplicates the column names. Maybe > that's a feature that we could have in Com

RE: Re: [CSV] Strategies to handle duplicate headers

2023-06-21 Thread Seth Falco
I don't have a strong enough opinion to conclude what's best. Giving it more thought, I think the interface approach I proposed is overcomplicated tbh. I can't imagine needing another duplicate header mode after this. However, I could imagine situations where we define DuplicateHeaderMode.DE

Re: [io] Possible addition of getFiles/getFileNames and Ant includes/excludes

2023-06-21 Thread Gary Gregory
I think you could achieve what you want today using the existing PathFilter hierarchy plus a custom Ant PathFilter. WDYT? Gary On Mon, Jun 19, 2023, 20:17 Gary Gregory wrote: > Hi Elliotte, > > On Mon, Jun 19, 2023 at 4:08 PM Elliotte Rusty Harold > wrote: > > > > I'm working on a longterm pr

Re: [VOTE] Release Apache Commons Codec 1.16.0 based on RC2

2023-06-21 Thread Gary Gregory
My +1 Gary On Sat, Jun 17, 2023, 20:24 Gary Gregory wrote: > We have fixed a few bugs and added some enhancements since Apache > Commons Codec 1.15 was released, so I would like to release Apache > Commons Codec 1.16.0. > > Apache Commons Codec 1.16.0 RC2 is available for review here: > htt

Re: [VOTE] Release Apache Commons Codec 1.16.0 based on RC2

2023-06-21 Thread Rob Tompkins
+1 Builds on Java 8, Java 11, and Java 17 with Maven 3.9.2. Reports all look good, signatures all look good, site looks good… Cheers, -Rob > On Jun 17, 2023, at 8:24 PM, Gary Gregory wrote: > > We have fixed a few bugs and added some enhancements since Apache > Commons Codec 1.15 was released, so I

Re: [CSV] Strategies to handle duplicate headers

2023-06-21 Thread Gary Gregory
Well, maybe we should not have a postfix string method, that assumes a lot. A default implementation of a function to convert all header names sounds better. Gary On Wed, Jun 21, 2023, 09:11 Gary Gregory wrote: > So it is starting to sound like we need either to add to CSVFormat: > > - "duplica

Re: [CSV] Strategies to handle duplicate headers

2023-06-21 Thread Gary Gregory
So it is starting to sound like we need to either add to CSVFormat: - "duplicate header postfix string", or - deprecate duplicate header mode in favor of a duplicate header strategy which holds a duplicate header mode plus a duplicate header postfix string and some functional interface for custom p

Re: [CSV] Strategies to handle duplicate headers

2023-06-21 Thread David Dellsperger
I've always had a big concern with this kind of behavior, because what happens if the "new column" already exists but later in the header? It seems like python/pandas deals with this by incrementing AGAIN, so they read the header and THEN decide what to do with the values for duplicates (make sense
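
The collision concern raised above can be made concrete with a small sketch. This is not Commons CSV code and not the pandas implementation; the function name dedupe_headers and the "." separator are hypothetical, chosen only for illustration. It reads the whole header first, then renames duplicates, skipping any suffixed name that already appears elsewhere in the header.

```python
def dedupe_headers(headers, sep="."):
    """Return a copy of headers in which every name is unique.

    Hypothetical helper for illustration only: duplicates get a numeric
    suffix, and the suffix is incremented again whenever the candidate
    name collides with a name that already exists in the header.
    """
    taken = set(headers)   # every name in the original header, read up front
    seen = set()           # names already emitted into the result
    counts = {}            # last suffix number used per base name
    result = []
    for name in headers:
        if name not in seen:
            # First occurrence keeps its original name.
            seen.add(name)
            result.append(name)
            continue
        n = counts.get(name, 0) + 1
        candidate = f"{name}{sep}{n}"
        # Increment again if the "new column" already exists anywhere.
        while candidate in taken:
            n += 1
            candidate = f"{name}{sep}{n}"
        counts[name] = n
        taken.add(candidate)
        seen.add(candidate)
        result.append(candidate)
    return result


if __name__ == "__main__":
    # The second "a" cannot become "a.1" because "a.1" appears later.
    print(dedupe_headers(["a", "a", "a.1"]))
```

Reading the full header before renaming is what avoids the problem described in the message: a single left-to-right pass that renames eagerly could produce a name that a later column already uses.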