[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-05-13 Thread Sven Van Caekenberghe
There is now the following commit, which should address a couple of issues: https://github.com/svenvc/NeoCSV/commit/0acc2270b382f52533c478f2f1585341e390d4b5 > On 22 Jan 2021, at 12:15, jtuc...@objektfabrik.de wrote: > > Tim, > > On 22.01.21 at 10:22, Tim Mackinnon wrote: >> I’m not

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-22 Thread jtuc...@objektfabrik.de
Tim, On 22.01.21 at 10:22, Tim Mackinnon wrote: I’m not doing any CSV processing at the moment, but have in the past - so was interested in this thread. @Kasper, can’t you just use #readHeader upfront, and do the assertion yourself, and then proceed to loop through your records? It would

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-22 Thread Tim Mackinnon
I’m not doing any CSV processing at the moment, but have in the past - so was interested in this thread. @Kasper, can’t you just use #readHeader upfront, and do the assertion yourself, and then proceed to loop through your records? It would seem that the Neo caters for what you are suggesting

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-21 Thread jtuc...@objektfabrik.de
Kasper, I think this is somewhat close to another thing I am describing here: https://github.com/svenvc/NeoCSV/issues/20 The problem with extending NeoCSV endlessly is that some of the things we need with "real-life" CSV files is the fact that

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-21 Thread Kasper Osterbye
As it happened, I ran into the exact same scenario as Joachim just the other day, that is, the external provider of my csv had added some new columns. In my case it manifested itself as an error that an integer field was not an integer (because new columns were added in the middle). Reading through

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-07 Thread Richard O'Keefe
Thank you very much. I converted your benchmark to my Smalltalk dialect and was pleased with the results. This gave me the impetus I needed to implement the #recordClass: feature of NeoCSVReader, although in my case it requires the class to implement #withAll: and the operand is a (reused)

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-07 Thread Sven Van Caekenberghe
> On 7 Jan 2021, at 07:15, Richard O'Keefe wrote: > > You aren't sure what point I was making? > How about the one I actually wrote down: > What test data was NeoCSV benchmarked with > and can I get my hands on it? > THAT is the point. The data points I showed (and > many others I have

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-06 Thread jtuc...@objektfabrik.de
Richard, On 07.01.21 at 07:15, Richard O'Keefe wrote: You aren't sure what point I was making? Exactly, the thread you answered was about a possible bug in NeoCSV parser. Your post was about your doubts about the claim of efficiency on the parser's web site. So you threw in some

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-06 Thread Richard O'Keefe
You aren't sure what point I was making? How about the one I actually wrote down: What test data was NeoCSV benchmarked with and can I get my hands on it? THAT is the point. The data points I showed (and many others I have not) are not satisfactory to me. I have been searching for CSV test

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-06 Thread Sven Van Caekenberghe
Joachim, > On 6 Jan 2021, at 11:21, jtuc...@objektfabrik.de wrote: > > Hi Sven, > > > I must say I am really happy with your change. We get a nice exception > whenever the number of fieldAccessor doesn't match with the number of defined > fieldAccessors. So far it also seems the endless

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-06 Thread jtuc...@objektfabrik.de
Hi Sven, I must say I am really happy with your change. We get a nice exception whenever the number of fieldAccessor doesn't match with the number of defined fieldAccessors. So far it also seems the endless loops are gone as well. What a leap forward! I'm adding an issue on github about

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-06 Thread Sven Van Caekenberghe
Hi Richard, Benchmarking is a can of worms, many factors have to be considered. But the first requirement is obviously to be completely open about what you are doing and what you are comparing. NeoCSV contains a simple benchmark suite called NeoCSVBenchmark, which was used during development.

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-06 Thread jtuc...@objektfabrik.de
Richard, I am not sure what point you are trying to make here. You have something cooler and faster? Great, how about sharing? You could make a faster one when it doesn't convert numbers and stuff? Great. I guess the time will be spent after parsing in 95% of the use cases. It depends. And

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-06 Thread Stéphane Ducasse
Another point: in open source and in this community, either the code people mention is open source and accessible, or it does not exist. If it does not exist then this is easy :) S. > On 6 Jan 2021, at 05:10, Richard O'Keefe wrote: > > NeoCSVReader is described as efficient. What is that

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-06 Thread Stéphane Ducasse
John, I’m sorry to tell you this, but you cannot write mail like that without telling us where the code of CVSParser is. You basically cannot discredit the work of Sven without providing code to compare. S. > On 6 Jan 2021, at 05:10, Richard O'Keefe wrote: > > NeoCSVReader is described as

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-05 Thread Richard O'Keefe
NeoCSVReader is described as efficient. What is that in comparison to? What benchmark data are used? Here are benchmark results measured today (5,000 data line file, 9,145,009 characters).
method                  time (ms)
Just read characters    410
CSVDecoder>>next        3415
astc's CSV

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-05 Thread jtuc...@objektfabrik.de
Sven, I tested your change with the file and filter (our own way of defining csv mappings by the end users) which used to send our application into an endless loop. And voila: we get an exception instead of a frozen image! I will give the conversion errors a test drive tomorrow. I am

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-05 Thread jtuc...@objektfabrik.de
Hi Sven, all I can say is: wow. I have no words. I will have to learn a bit about Pharo and github real quick now in order to try your changes. Thank you very much. I'll give you feedback as fast as I can. (And forget my questions about #readAtEndOrEndOfLine. I somehow didn't understand

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-05 Thread Sven Van Caekenberghe
Hi Joachim, Have a look at the following commit: https://github.com/svenvc/NeoCSV/commit/a3d6258c28138fe3b15aa03ae71cf1e077096d39 and specifically the added unit tests. These should help clarify the new behaviour. If anything is not clear, please ask. HTH, Sven > On 5 Jan 2021, at 08:49,

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-04 Thread jtuc...@objektfabrik.de
Sven, first of all thanks a lot for taking your time with this! Your test case is so beautifully small I can't believe it ;-) While I think some kind of validation could help with parsing CSV, I remember reading your comment on this in some other discussion long ago. You wrote you don't see

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-04 Thread jtuc...@objektfabrik.de
Paul, thank you very much for this idea. Your suggestion is probably "good enough" to at least catch errors when the number of columns doesn't match in the whole file or the first row. For my use case, it wouldn't make any difference if the first row contains header information or not. There

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-04 Thread Sven Van Caekenberghe
Hi Joachim, Thanks for the detailed feedback. This is most helpful. I need to think more about this and experiment a bit. This is what I came up with in a Workspace/Playground:
input := 'foo,1
bar,2
foobar,3'.
(NeoCSVReader on: input readStream) upToEnd.
(NeoCSVReader on: input readStream)

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-04 Thread Paul DeBruicker
After instantiating the reader and before doing the reading you can #readHeader and check that the reader field count and header field count match. Would that help? If the CSV doesn't use headers then you can process the "header" as the first record and then process the rest of the file.
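
A minimal Playground sketch of the check suggested here, assuming NeoCSV is loaded in a Pharo image. The sample input, the expected column count and the variable names are made up for illustration. The NeoCSVReader messages used are #on:, #readHeader, #addField, #addIntegerField and #upToEnd.

```smalltalk
| input expectedCount reader header records |
"Hypothetical sample data: a header line followed by two records."
input := 'name,count
foo,1
bar,2'.
"Assumption: the caller knows how many columns its field definitions cover."
expectedCount := 2.
reader := NeoCSVReader on: input readStream.
"Consume the first line and get the column names back as a collection."
header := reader readHeader.
header size = expectedCount
	ifFalse: [ Error signal: 'Expected ' , expectedCount printString
		, ' columns, found ' , header size printString ].
"Only define the fields and read the remaining records once the check has passed."
reader addField; addIntegerField.
records := reader upToEnd.
```

This only validates the first line of the file. A record further down with a different number of fields is not caught by this check.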

[Pharo-users] Re: NeoCSVReader and wrong number of fieldAccessors

2021-01-04 Thread jtuc...@objektfabrik.de
Please find attached a small test case to demonstrate what I mean. There is just some nonsense Business Object class and a simple test case in this fileout. On 04.01.21 at 14:36, jtuc...@objektfabrik.de wrote: Happy new year to all of you! May 2021 be an increasingly less crazy year than