Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Am 16. März 2012 23:36 schrieb Emmanuel Bourg ebo...@apache.org: Choice is good I agree. Commons CSV will also support annotated POJO, that will give two ways to use the API. Are we still talking about how to create CSVFormat instances? Benedikt Emmanuel Bourg Le 16/03/2012 22:26, Simone Tripodi a écrit : Hi all, whatever name/pattern is called what we intend to apply, the result doesn't change :P Jokes a part, given the past experience of Digester3, as reported by Matt and James, I can suggest you to not limit to users the possibilities to chose their preferred approaches. Digester3 - which of course has a larger set of APIs - allows users configuring it with four APIs set: * plain old Digester2.X addRule() alike methods; * the RulesBinder fluent APIs; * Annotated POJOs, built on top of RulesBinder; * XML descriptors, built on top of RulesBinder. no restrictions - just provide the n API layers that users want/need on top of one in order to centralize the errors and make them satisfied (which is he most important side, IMHO). When developing Digester3, I wondered who would have used the xmlrules today: pooff, magically a users not only is using it, he's also contributing on making it betetr on supporting multi-thread environments. So, concluding: instead of choosing which approach has to be applied, just apply both as Seb is proposing. Just my 0.02 cents, -Simo http://people.apache.org/~simonetripodi/ http://simonetripodi.livejournal.com/ http://twitter.com/simonetripodi http://www.99soft.org/ On Fri, Mar 16, 2012 at 5:40 PM, James Carman jcar...@carmanconsulting.com wrote: Did I say they were the same? On Mar 16, 2012 12:22 PM, Emmanuel Bourgebo...@apache.org wrote: Le 16/03/2012 13:34, James Carman a écrit : +1 for builder pattern and fluent API Fluent API != Builder Pattern They are similar because they use method chaining, but that's not equivalent. http://martinfowler.com/bliki/**FluentInterface.htmlhttp://martinfowler.com/bliki/FluentInterface.html http://en.wikipedia.org/wiki/**Fluent_interfacehttp://en.wikipedia.org/wiki/Fluent_interface http://en.wikipedia.org/wiki/**Builder_patternhttp://en.wikipedia.org/wiki/Builder_pattern Emmanuel Bourg --**--**- To unsubscribe, e-mail: dev-unsubscribe@commons.**apache.orgdev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Hi Benedikt, I accept that, and I respect your decision. I don't want to argue with you on this, I just want to understand your decision. From Effective Java, I learned if you are facing a constructor with lots of optional arguments, consider using the builder pattern. Can you explain, why you think it is not appropriate in this case? You said it is too verbose, but it's just one additional call, compared with what we have now. You said it, it's an additional call. For [csv] I'd like to focus on a simple and user friendly API. Effective Java is a good reference, but I think we should find a balance between theorical by the book perfectness and user friendliness. The advantage of the builder (sorry now I'm arguing :) is, that nobody has to remember to validate the format. Even if validation is something that is package private, that could lead to the point where someone forgets to add that line. Now you could say we have unit tests for that. But isn't it the responsibility of an object to verify that it is being instantiated into a valid state? There was no validation before and nobody complained. I think no other CSV API validates its parameters. I thought it was harmless and convenient to add it, but now if it's used as an argument to make the API more verbose I'd rather remove it completely. Also we don't know yet, if there always will be only one package in the library. What if, we add another package (o.a.c.csv.beanmapping or what ever) and we want to use CSVFormat there. Then we would have to make validate public, exposing it to the outside world. Well, why not if that's necessary at this time. Emmanuel Bourg smime.p7s Description: S/MIME Cryptographic Signature
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Have a look at the sample builder implementation in https://issues.apache.org/jira/browse/CSV-68 On 16 March 2012 09:45, Emmanuel Bourg ebo...@apache.org wrote: Hi Benedikt, I accept that, and I respect your decision. I don't want to argue with you on this, I just want to understand your decision. From Effective Java, I learned if you are facing a constructor with lots of optional arguments, consider using the builder pattern. Can you explain, why you think it is not appropriate in this case? You said it is too verbose, but it's just one additional call, compared with what we have now. You said it, it's an additional call. For [csv] I'd like to focus on a simple and user friendly API. Effective Java is a good reference, but I think we should find a balance between theorical by the book perfectness and user friendliness. The advantage of the builder (sorry now I'm arguing :) is, that nobody has to remember to validate the format. Even if validation is something that is package private, that could lead to the point where someone forgets to add that line. Now you could say we have unit tests for that. But isn't it the responsibility of an object to verify that it is being instantiated into a valid state? There was no validation before and nobody complained. I think no other CSV API validates its parameters. I thought it was harmless and convenient to add it, but now if it's used as an argument to make the API more verbose I'd rather remove it completely. Also we don't know yet, if there always will be only one package in the library. What if, we add another package (o.a.c.csv.beanmapping or what ever) and we want to use CSVFormat there. Then we would have to make validate public, exposing it to the outside world. Well, why not if that's necessary at this time. Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
+1 for builder pattern and fluent API On Mar 16, 2012 4:24 AM, Benedikt Ritter benerit...@googlemail.com wrote: Am 15. März 2012 21:20 schrieb Emmanuel Bourg ebo...@apache.org: Le 15/03/2012 20:26, Benedikt Ritter a écrit : How about you Emmanuel? Could sebb convince you? ;-) How about this: I'll create a patch and attach it to JIRA. Then we'll have a better basis for discussion. Sorry but I'm not convinced. I see exactly where this leads. Hey Emmanuel, I accept that, and I respect your decision. I don't want to argue with you on this, I just want to understand your decision. From Effective Java, I learned if you are facing a constructor with lots of optional arguments, consider using the builder pattern. Can you explain, why you think it is not appropriate in this case? You said it is too verbose, but it's just one additional call, compared with what we have now. The advantage of the builder (sorry now I'm arguing :) is, that nobody has to remember to validate the format. Even if validation is something that is package private, that could lead to the point where someone forgets to add that line. Now you could say we have unit tests for that. But isn't it the responsibility of an object to verify that it is being instantiated into a valid state? Also we don't know yet, if there always will be only one package in the library. What if, we add another package (o.a.c.csv.beanmapping or what ever) and we want to use CSVFormat there. Then we would have to make validate public, exposing it to the outside world. If you have some free time and want to do something fun you can try reimplementing the parser with less array copies. Commons CSV is still behind the other APIs on the performance side. I'll have a look at that, and at what Ted suggested on the other thread. TIA for taking your time to explain!! Benedikt Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
The votes show that several of us in the Commons community are leaning more and more toward fluent APIs these days. But rather than wasting time going back and forth arguing about it, why not build the fluent API separately from the rest of the package and leave the choice up to the user? I don't think I'm incorrect to say that this is more or less what [digester]3 does. Matt On Fri, Mar 16, 2012 at 7:34 AM, James Carman jcar...@carmanconsulting.com wrote: +1 for builder pattern and fluent API On Mar 16, 2012 4:24 AM, Benedikt Ritter benerit...@googlemail.com wrote: Am 15. März 2012 21:20 schrieb Emmanuel Bourg ebo...@apache.org: Le 15/03/2012 20:26, Benedikt Ritter a écrit : How about you Emmanuel? Could sebb convince you? ;-) How about this: I'll create a patch and attach it to JIRA. Then we'll have a better basis for discussion. Sorry but I'm not convinced. I see exactly where this leads. Hey Emmanuel, I accept that, and I respect your decision. I don't want to argue with you on this, I just want to understand your decision. From Effective Java, I learned if you are facing a constructor with lots of optional arguments, consider using the builder pattern. Can you explain, why you think it is not appropriate in this case? You said it is too verbose, but it's just one additional call, compared with what we have now. The advantage of the builder (sorry now I'm arguing :) is, that nobody has to remember to validate the format. Even if validation is something that is package private, that could lead to the point where someone forgets to add that line. Now you could say we have unit tests for that. But isn't it the responsibility of an object to verify that it is being instantiated into a valid state? Also we don't know yet, if there always will be only one package in the library. What if, we add another package (o.a.c.csv.beanmapping or what ever) and we want to use CSVFormat there. Then we would have to make validate public, exposing it to the outside world. If you have some free time and want to do something fun you can try reimplementing the parser with less array copies. Commons CSV is still behind the other APIs on the performance side. I'll have a look at that, and at what Ted suggested on the other thread. TIA for taking your time to explain!! Benedikt Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
On 16 March 2012 14:53, Matt Benson gudnabr...@gmail.com wrote: The votes show that several of us in the Commons community are leaning more and more toward fluent APIs these days. But rather than wasting time going back and forth arguing about it, why not build the fluent API separately from the rest of the package and leave the choice up to the user? I don't think I'm incorrect to say that this is more or less what [digester]3 does. My proposed builder does not change the current code at all. However ideally it should be defined in the same class so as to keep the constructor private (and it will need access to a private constant). Note that the original methods are also fluent in the sense that they can be chained together. The only difference is that the proposed fluent interface requires a starting method (or it could be a ctor) and a build() method call at the end. If required, both sets of methods could be maintained going forward, but that seems like an unnecessary maintenance overhead. CSV has yet to be released, so there's no need for backwards compatibilty and therefore we don't _need_ to keep the existing methods. Matt On Fri, Mar 16, 2012 at 7:34 AM, James Carman jcar...@carmanconsulting.com wrote: +1 for builder pattern and fluent API On Mar 16, 2012 4:24 AM, Benedikt Ritter benerit...@googlemail.com wrote: Am 15. März 2012 21:20 schrieb Emmanuel Bourg ebo...@apache.org: Le 15/03/2012 20:26, Benedikt Ritter a écrit : How about you Emmanuel? Could sebb convince you? ;-) How about this: I'll create a patch and attach it to JIRA. Then we'll have a better basis for discussion. Sorry but I'm not convinced. I see exactly where this leads. Hey Emmanuel, I accept that, and I respect your decision. I don't want to argue with you on this, I just want to understand your decision. From Effective Java, I learned if you are facing a constructor with lots of optional arguments, consider using the builder pattern. Can you explain, why you think it is not appropriate in this case? You said it is too verbose, but it's just one additional call, compared with what we have now. The advantage of the builder (sorry now I'm arguing :) is, that nobody has to remember to validate the format. Even if validation is something that is package private, that could lead to the point where someone forgets to add that line. Now you could say we have unit tests for that. But isn't it the responsibility of an object to verify that it is being instantiated into a valid state? Also we don't know yet, if there always will be only one package in the library. What if, we add another package (o.a.c.csv.beanmapping or what ever) and we want to use CSVFormat there. Then we would have to make validate public, exposing it to the outside world. If you have some free time and want to do something fun you can try reimplementing the parser with less array copies. Commons CSV is still behind the other APIs on the performance side. I'll have a look at that, and at what Ted suggested on the other thread. TIA for taking your time to explain!! Benedikt Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Yes this reminds me very much of the digester3 scenario. Let's start coding and quit talking! As sebb pointed out, this is unreleased (at least from our pmc) code. Let's try on a few of the alternatives and see which one looks/feels better (does this API make my butt look big?). On Mar 16, 2012 10:53 AM, Matt Benson gudnabr...@gmail.com wrote: The votes show that several of us in the Commons community are leaning more and more toward fluent APIs these days. But rather than wasting time going back and forth arguing about it, why not build the fluent API separately from the rest of the package and leave the choice up to the user? I don't think I'm incorrect to say that this is more or less what [digester]3 does. Matt On Fri, Mar 16, 2012 at 7:34 AM, James Carman jcar...@carmanconsulting.com wrote: +1 for builder pattern and fluent API On Mar 16, 2012 4:24 AM, Benedikt Ritter benerit...@googlemail.com wrote: Am 15. März 2012 21:20 schrieb Emmanuel Bourg ebo...@apache.org: Le 15/03/2012 20:26, Benedikt Ritter a écrit : How about you Emmanuel? Could sebb convince you? ;-) How about this: I'll create a patch and attach it to JIRA. Then we'll have a better basis for discussion. Sorry but I'm not convinced. I see exactly where this leads. Hey Emmanuel, I accept that, and I respect your decision. I don't want to argue with you on this, I just want to understand your decision. From Effective Java, I learned if you are facing a constructor with lots of optional arguments, consider using the builder pattern. Can you explain, why you think it is not appropriate in this case? You said it is too verbose, but it's just one additional call, compared with what we have now. The advantage of the builder (sorry now I'm arguing :) is, that nobody has to remember to validate the format. Even if validation is something that is package private, that could lead to the point where someone forgets to add that line. Now you could say we have unit tests for that. But isn't it the responsibility of an object to verify that it is being instantiated into a valid state? Also we don't know yet, if there always will be only one package in the library. What if, we add another package (o.a.c.csv.beanmapping or what ever) and we want to use CSVFormat there. Then we would have to make validate public, exposing it to the outside world. If you have some free time and want to do something fun you can try reimplementing the parser with less array copies. Commons CSV is still behind the other APIs on the performance side. I'll have a look at that, and at what Ted suggested on the other thread. TIA for taking your time to explain!! Benedikt Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Am 14. März 2012 22:47 schrieb sebb seb...@gmail.com: On 14 March 2012 21:40, Benedikt Ritter benerit...@googlemail.com wrote: Am 14. März 2012 22:33 schrieb Emmanuel Bourg ebo...@apache.org: Le 14/03/2012 22:25, Benedikt Ritter a écrit : I agree with you on this. However, I think it would be better to tie validation to the object creation. Maybe the Builder Pattern like shown in Effective Java p. 14-15 is a reasonable solution for this case? It would be a bit more verbose, but we can be sure that everything will be validated. That's too verbose, please let's keep this simple API. okay! although, I don't find CSVFormat format = CSVFormat.defaults().withDelimiter('#').withCommentStart('/').build() too verbose ;) Agree entirely. And parse and format could perform an implicit build(). It would also allow one to eliminate the additional instance creation. How about you Emmanuel? Could sebb convince you? ;-) How about this: I'll create a patch and attach it to JIRA. Then we'll have a better basis for discussion. Bonne nuit! Gute Nacht ? Benedikt Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Le 15/03/2012 20:26, Benedikt Ritter a écrit : How about you Emmanuel? Could sebb convince you? ;-) How about this: I'll create a patch and attach it to JIRA. Then we'll have a better basis for discussion. Sorry but I'm not convinced. I see exactly where this leads. If you have some free time and want to do something fun you can try reimplementing the parser with less array copies. Commons CSV is still behind the other APIs on the performance side. Emmanuel Bourg smime.p7s Description: S/MIME Cryptographic Signature
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Le 14/03/2012 21:52, Benedikt Ritter a écrit : the subject of this mail is pretty self-explanatory. Why do we need a package private validate() method, given the fact, that users can not create custom instances (constructor is package private)? You could even argue, that no validation is needed at all, since we are in control of what formats can be created. However I'd say, the validation makes sure that we do not unintentionally create invalid formats. But then validate() can be made private and called at the end of the constructor. Because the format could temporarily be in an invalid state. Something like this would break (a bit far fetched, I agree): CSVFormat.DEFAULT.withDelimiter('#').withCommentStart('/'); Emmanuel Bourg smime.p7s Description: S/MIME Cryptographic Signature
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Am 14. März 2012 22:02 schrieb Emmanuel Bourg ebo...@apache.org: Le 14/03/2012 21:52, Benedikt Ritter a écrit : the subject of this mail is pretty self-explanatory. Why do we need a package private validate() method, given the fact, that users can not create custom instances (constructor is package private)? You could even argue, that no validation is needed at all, since we are in control of what formats can be created. However I'd say, the validation makes sure that we do not unintentionally create invalid formats. But then validate() can be made private and called at the end of the constructor. Because the format could temporarily be in an invalid state. Something like this would break (a bit far fetched, I agree): CSVFormat.DEFAULT.withDelimiter('#').withCommentStart('/'); okay, I understand. But doesn't that make things worse? Users can create invalid formats, but they can not call validate(), because it is package private. Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
On 14 March 2012 21:02, Emmanuel Bourg ebo...@apache.org wrote: Le 14/03/2012 21:52, Benedikt Ritter a écrit : the subject of this mail is pretty self-explanatory. Why do we need a package private validate() method, given the fact, that users can not create custom instances (constructor is package private)? You could even argue, that no validation is needed at all, since we are in control of what formats can be created. However I'd say, the validation makes sure that we do not unintentionally create invalid formats. But then validate() can be made private and called at the end of the constructor. Because the format could temporarily be in an invalid state. Something like this would break (a bit far fetched, I agree): CSVFormat.DEFAULT.withDelimiter('#').withCommentStart('/'); In which case, reversing the order would work. We would just have to document this as a restriction. The problem with leaving the validation until later is that it decouples the cause and effect. This makes it a bit harder to debug. It's also simpler if there is a single validation method. At present some of the setters also perform validation. BTW, we should probably reject delimiter == DISABLED. Also, the DISABLED constant needs to be public (or available via a public getter) otherwise it's not possible to disable all but the delimiter. Using a getter would allow the constant to be changed if it became necessary. Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Am 14. März 2012 22:16 schrieb sebb seb...@gmail.com: On 14 March 2012 21:02, Emmanuel Bourg ebo...@apache.org wrote: Le 14/03/2012 21:52, Benedikt Ritter a écrit : the subject of this mail is pretty self-explanatory. Why do we need a package private validate() method, given the fact, that users can not create custom instances (constructor is package private)? You could even argue, that no validation is needed at all, since we are in control of what formats can be created. However I'd say, the validation makes sure that we do not unintentionally create invalid formats. But then validate() can be made private and called at the end of the constructor. Because the format could temporarily be in an invalid state. Something like this would break (a bit far fetched, I agree): CSVFormat.DEFAULT.withDelimiter('#').withCommentStart('/'); In which case, reversing the order would work. We would just have to document this as a restriction. The problem with leaving the validation until later is that it decouples the cause and effect. This makes it a bit harder to debug. It's also simpler if there is a single validation method. At present some of the setters also perform validation. I agree with you on this. However, I think it would be better to tie validation to the object creation. Maybe the Builder Pattern like shown in Effective Java p. 14-15 is a reasonable solution for this case? It would be a bit more verbose, but we can be sure that everything will be validated. BTW, we should probably reject delimiter == DISABLED. Also, the DISABLED constant needs to be public (or available via a public getter) otherwise it's not possible to disable all but the delimiter. Using a getter would allow the constant to be changed if it became necessary. Only if we remove Serializable... Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Le 14/03/2012 22:15, Benedikt Ritter a écrit : okay, I understand. But doesn't that make things worse? Users can create invalid formats, but they can not call validate(), because it is package private. Worse? There was no validation originally, users could do any kind of absurd things. I don't think this is an issue, the format will be validated one line later when it's used for parsing or printing. Emmanuel Bourg smime.p7s Description: S/MIME Cryptographic Signature
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Le 14/03/2012 22:25, Benedikt Ritter a écrit : I agree with you on this. However, I think it would be better to tie validation to the object creation. Maybe the Builder Pattern like shown in Effective Java p. 14-15 is a reasonable solution for this case? It would be a bit more verbose, but we can be sure that everything will be validated. That's too verbose, please let's keep this simple API. Emmanuel Bourg smime.p7s Description: S/MIME Cryptographic Signature
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Am 14. März 2012 22:33 schrieb Emmanuel Bourg ebo...@apache.org: Le 14/03/2012 22:25, Benedikt Ritter a écrit : I agree with you on this. However, I think it would be better to tie validation to the object creation. Maybe the Builder Pattern like shown in Effective Java p. 14-15 is a reasonable solution for this case? It would be a bit more verbose, but we can be sure that everything will be validated. That's too verbose, please let's keep this simple API. okay! although, I don't find CSVFormat format = CSVFormat.defaults().withDelimiter('#').withCommentStart('/').build() too verbose ;) Bonne nuit! Benedikt Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Le 14/03/2012 22:16, sebb a écrit : In which case, reversing the order would work. We would just have to document this as a restriction. Too much burden on the user. The problem with leaving the validation until later is that it decouples the cause and effect. This makes it a bit harder to debug. It'll break one line bellow the creation of the format in 99.99% of the cases, and the exception thrown is very explicit. It's also simpler if there is a single validation method. At present some of the setters also perform validation. I don't see this as an issue. BTW, we should probably reject delimiter == DISABLED. Ok Also, the DISABLED constant needs to be public (or available via a public getter) otherwise it's not possible to disable all but the delimiter. I'd better use a null value to disable a feature than exposing the DISABLED constant. Emmanuel Bourg smime.p7s Description: S/MIME Cryptographic Signature
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
On 14 March 2012 21:41, Emmanuel Bourg ebo...@apache.org wrote: Le 14/03/2012 22:16, sebb a écrit : In which case, reversing the order would work. We would just have to document this as a restriction. Too much burden on the user. The problem with leaving the validation until later is that it decouples the cause and effect. This makes it a bit harder to debug. It'll break one line bellow the creation of the format in 99.99% of the cases, and the exception thrown is very explicit. It's also simpler if there is a single validation method. At present some of the setters also perform validation. I don't see this as an issue. BTW, we should probably reject delimiter == DISABLED. Ok Also, the DISABLED constant needs to be public (or available via a public getter) otherwise it's not possible to disable all but the delimiter. I'd better use a null value to disable a feature than exposing the DISABLED constant. It's not possible to use null for a char value. Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
On 14 March 2012 21:40, Benedikt Ritter benerit...@googlemail.com wrote: Am 14. März 2012 22:33 schrieb Emmanuel Bourg ebo...@apache.org: Le 14/03/2012 22:25, Benedikt Ritter a écrit : I agree with you on this. However, I think it would be better to tie validation to the object creation. Maybe the Builder Pattern like shown in Effective Java p. 14-15 is a reasonable solution for this case? It would be a bit more verbose, but we can be sure that everything will be validated. That's too verbose, please let's keep this simple API. okay! although, I don't find CSVFormat format = CSVFormat.defaults().withDelimiter('#').withCommentStart('/').build() too verbose ;) Agree entirely. And parse and format could perform an implicit build(). It would also allow one to eliminate the additional instance creation. Bonne nuit! Gute Nacht ? Benedikt Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Le 14/03/2012 22:44, sebb a écrit : It's not possible to use null for a char value. It assumed we changed the signature of the methods with java.lang.Character, while retaining the primitive internally. But that's not possible due to the getters. If possible I'd like to not expose this constant. The idea was that we start with disabled features by default in the predefined formats, and then we enable what's necessary. Thus we never have to disable anything. Emmanuel Bourg smime.p7s Description: S/MIME Cryptographic Signature
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
On 14 March 2012 21:56, Emmanuel Bourg ebo...@apache.org wrote: Le 14/03/2012 22:44, sebb a écrit : It's not possible to use null for a char value. It assumed we changed the signature of the methods with java.lang.Character, while retaining the primitive internally. But that's not possible due to the getters. I've just realised that the DISABLED value is actually exposed - just use a format with a disabled character and fetch it. It's not too late to change to using Character. If possible I'd like to not expose this constant. The idea was that we start with disabled features by default in the predefined formats, and then we enable what's necessary. Thus we never have to disable anything. It's not possible currently to create a format with encapsulator, commentStart and escape all null, except by knowing the value of DISABLED. Maybe that's not allowed, but it's not rejected by validation, and can be created by using '\ufffe' directly. Likewise, encapsulator and escape are not mutually exclusive. Should they be? Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Le 14/03/2012 23:35, sebb a écrit : It's not too late to change to using Character. The issue is the parser, this will probably degrade the performance. Unless the fields are made package private and accessed directly by the parser. It's not possible currently to create a format with encapsulator, commentStart and escape all null, except by knowing the value of DISABLED. The solution might be to introduce a quoting mode to the format. I planned to add this for the printer, but it can be useful for the parser too. This mode would have 3 states: - NEVER: Don't use quotes, even if the encapsulator is defined - ALWAYS: Always enclose values into quotes - REQUIRED: Enclose the values only if necessary Thus the quotes could be disabled with: CSVFormat.DEFAULT.withEncapsulation(NEVER) Likewise, encapsulator and escape are not mutually exclusive. Should they be? I wish it was, but MySQL actually produces files with quotes and escaped characters (delimiter and line feeds). I'm reviewing other RDBMS to see what formats we can expect in the wild. Emmanuel Bourg smime.p7s Description: S/MIME Cryptographic Signature
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
On 14 March 2012 22:54, Emmanuel Bourg ebo...@apache.org wrote: Le 14/03/2012 23:35, sebb a écrit : It's not too late to change to using Character. The issue is the parser, this will probably degrade the performance. Unless the fields are made package private and accessed directly by the parser. I don't see why, so long as you fetch the values once at the start. There's a theoretical problem with using a valid char value as a disabled indicator, at least with the parser as it stands. It assumes that the disabled char cannot occur in a file; that is not strictly true, so it could detect an escape where there is none. The example string pu\\ufffeblic is parsed as puBELlic when using CSVFormat.TDF.withUnicodeEscapesInterpreted(true) - i.e. the disabled char is treated as an escape, and \b = BEL. This could be avoided by checking isEscaping in the parser; similarly for the other chars that can be disabled. It's not possible currently to create a format with encapsulator, commentStart and escape all null, except by knowing the value of DISABLED. The solution might be to introduce a quoting mode to the format. I planned to add this for the printer, but it can be useful for the parser too. This mode would have 3 states: - NEVER: Don't use quotes, even if the encapsulator is defined I think it would be confusing to have a format with an encapsulator that is not used. - ALWAYS: Always enclose values into quotes - REQUIRED: Enclose the values only if necessary Thus the quotes could be disabled with: CSVFormat.DEFAULT.withEncapsulation(NEVER) Or provide a constant format with all 3 disabled. But it would still be simpler to be able to override each and every char independently; using Character is going to be the simplest way to achieve that. Probably still need to make output encapsulation switchable, but that's a different matter. Likewise, encapsulator and escape are not mutually exclusive. Should they be? I wish it was, but MySQL actually produces files with quotes and escaped characters (delimiter and line feeds). I'm reviewing other RDBMS to see what formats we can expect in the wild. OK. Emmanuel Bourg - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?
Le 15/03/2012 00:32, sebb a écrit : I don't see why, so long as you fetch the values once at the start. I tried that and the parser was slower. Don't ask me why. There's a theoretical problem with using a valid char value as a disabled indicator, at least with the parser as it stands. It assumes that the disabled char cannot occur in a file; that is not strictly true, so it could detect an escape where there is none. The example string pu\\ufffeblic is parsed as puBELlic when using CSVFormat.TDF.withUnicodeEscapesInterpreted(true) - i.e. the disabled char is treated as an escape, and \b =BEL. This could be avoided by checking isEscaping in the parser; similarly for the other chars that can be disabled. Good point, I didn't see this one. It can also be avoided by filtering invalid sequences in UnicodeUnescapeReader (or \ufffe specifically). Character.isDefined(char) will tell us if the character exists. U+FFFE doesn't: http://www.fileformat.info/info/unicode/char/fffe Emmanuel Bourg smime.p7s Description: S/MIME Cryptographic Signature