Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-17 Thread Benedikt Ritter
Am 16. März 2012 23:36 schrieb Emmanuel Bourg ebo...@apache.org:
 Choice is good I agree. Commons CSV will also support annotated POJO, that
 will give two ways to use the API.


Are we still talking about how to create CSVFormat instances?

Benedikt

 Emmanuel Bourg


 Le 16/03/2012 22:26, Simone Tripodi a écrit :

 Hi all,

 whatever name/pattern is called what we intend to apply, the result
 doesn't change :P

 Jokes a part, given the past experience of Digester3, as reported by
 Matt and James, I can suggest you to not limit to users the
 possibilities to chose their preferred approaches.

 Digester3 - which of course has a larger set of APIs - allows users
 configuring it with four APIs set:

  * plain old Digester2.X addRule() alike methods;
  * the RulesBinder fluent APIs;
  * Annotated POJOs, built on top of RulesBinder;
  * XML descriptors, built on top of RulesBinder.

 no restrictions - just provide the n API layers that users want/need
 on top of one in order to centralize the errors and make them
 satisfied (which is he most important side, IMHO).
 When developing Digester3, I wondered who would have used the xmlrules
 today: pooff, magically a users not only is using it, he's also
 contributing on making it betetr on supporting multi-thread
 environments.

 So, concluding: instead of choosing which approach has to be applied,
 just apply both as Seb is proposing.
 Just my 0.02 cents,
 -Simo

 http://people.apache.org/~simonetripodi/
 http://simonetripodi.livejournal.com/
 http://twitter.com/simonetripodi
 http://www.99soft.org/



 On Fri, Mar 16, 2012 at 5:40 PM, James Carman
 jcar...@carmanconsulting.com  wrote:

 Did I say they were the same?
 On Mar 16, 2012 12:22 PM, Emmanuel Bourgebo...@apache.org  wrote:

 Le 16/03/2012 13:34, James Carman a écrit :

 +1 for builder pattern and fluent API


 Fluent API != Builder Pattern

 They are similar because they use method chaining, but that's not
 equivalent.


 http://martinfowler.com/bliki/**FluentInterface.htmlhttp://martinfowler.com/bliki/FluentInterface.html


 http://en.wikipedia.org/wiki/**Fluent_interfacehttp://en.wikipedia.org/wiki/Fluent_interface


 http://en.wikipedia.org/wiki/**Builder_patternhttp://en.wikipedia.org/wiki/Builder_pattern


 Emmanuel Bourg


 --**--**-
 To unsubscribe, e-mail:
 dev-unsubscribe@commons.**apache.orgdev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-16 Thread Emmanuel Bourg

Hi Benedikt,


I accept that, and I respect your decision. I don't want to argue with
you on this, I just want to understand your decision. From Effective
Java, I learned if you are facing a constructor with lots of optional
arguments, consider using the builder pattern. Can you explain, why
you think it is not appropriate in this case? You said it is too
verbose, but it's just one additional call, compared with what we have
now.


You said it, it's an additional call. For [csv] I'd like to focus on a 
simple and user friendly API. Effective Java is a good reference, but 
I think we should find a balance between theorical by the book 
perfectness and user friendliness.




The advantage of the builder (sorry now I'm arguing :) is, that nobody
has to remember to validate the format. Even if validation is
something that is package private, that could lead to the point where
someone forgets to add that line. Now you could say we have unit tests
for that. But isn't it the responsibility of an object to verify that
it is being instantiated into a valid state?


There was no validation before and nobody complained. I think no other 
CSV API validates its parameters. I thought it was harmless and 
convenient to add it, but now if it's used as an argument to make the 
API more verbose I'd rather remove it completely.




Also we don't know yet, if there always will be only one package in
the library. What if, we add another package (o.a.c.csv.beanmapping or
what ever) and we want to use CSVFormat there. Then we would have to
make validate public, exposing it to the outside world.


Well, why not if that's necessary at this time.

Emmanuel Bourg



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-16 Thread sebb
Have a look at the sample builder implementation in

https://issues.apache.org/jira/browse/CSV-68

On 16 March 2012 09:45, Emmanuel Bourg ebo...@apache.org wrote:
 Hi Benedikt,


 I accept that, and I respect your decision. I don't want to argue with
 you on this, I just want to understand your decision. From Effective
 Java, I learned if you are facing a constructor with lots of optional
 arguments, consider using the builder pattern. Can you explain, why
 you think it is not appropriate in this case? You said it is too
 verbose, but it's just one additional call, compared with what we have
 now.


 You said it, it's an additional call. For [csv] I'd like to focus on a
 simple and user friendly API. Effective Java is a good reference, but I
 think we should find a balance between theorical by the book perfectness
 and user friendliness.



 The advantage of the builder (sorry now I'm arguing :) is, that nobody
 has to remember to validate the format. Even if validation is
 something that is package private, that could lead to the point where
 someone forgets to add that line. Now you could say we have unit tests
 for that. But isn't it the responsibility of an object to verify that
 it is being instantiated into a valid state?


 There was no validation before and nobody complained. I think no other CSV
 API validates its parameters. I thought it was harmless and convenient to
 add it, but now if it's used as an argument to make the API more verbose I'd
 rather remove it completely.



 Also we don't know yet, if there always will be only one package in
 the library. What if, we add another package (o.a.c.csv.beanmapping or
 what ever) and we want to use CSVFormat there. Then we would have to
 make validate public, exposing it to the outside world.


 Well, why not if that's necessary at this time.

 Emmanuel Bourg


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-16 Thread James Carman
+1 for builder pattern and fluent API
On Mar 16, 2012 4:24 AM, Benedikt Ritter benerit...@googlemail.com
wrote:

 Am 15. März 2012 21:20 schrieb Emmanuel Bourg ebo...@apache.org:
  Le 15/03/2012 20:26, Benedikt Ritter a écrit :
 
 
  How about you Emmanuel? Could sebb convince you? ;-) How about this:
  I'll create a patch and attach it to JIRA. Then we'll have a better
  basis for discussion.
 
 
  Sorry but I'm not convinced. I see exactly where this leads.
 

 Hey Emmanuel,

 I accept that, and I respect your decision. I don't want to argue with
 you on this, I just want to understand your decision. From Effective
 Java, I learned if you are facing a constructor with lots of optional
 arguments, consider using the builder pattern. Can you explain, why
 you think it is not appropriate in this case? You said it is too
 verbose, but it's just one additional call, compared with what we have
 now.

 The advantage of the builder (sorry now I'm arguing :) is, that nobody
 has to remember to validate the format. Even if validation is
 something that is package private, that could lead to the point where
 someone forgets to add that line. Now you could say we have unit tests
 for that. But isn't it the responsibility of an object to verify that
 it is being instantiated into a valid state?
 Also we don't know yet, if there always will be only one package in
 the library. What if, we add another package (o.a.c.csv.beanmapping or
 what ever) and we want to use CSVFormat there. Then we would have to
 make validate public, exposing it to the outside world.

  If you have some free time and want to do something fun you can try
  reimplementing the parser with less array copies. Commons CSV is still
  behind the other APIs on the performance side.
 

 I'll have a look at that, and at what Ted suggested on the other thread.

 TIA for taking your time to explain!!
 Benedikt

  Emmanuel Bourg
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org




Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-16 Thread Matt Benson
The votes show that several of us in the Commons community are leaning
more and more toward fluent APIs these days.  But rather than wasting
time going back and forth arguing about it, why not build the fluent
API separately from the rest of the package and leave the choice up to
the user?  I don't think I'm incorrect to say that this is more or
less what [digester]3 does.

Matt

On Fri, Mar 16, 2012 at 7:34 AM, James Carman
jcar...@carmanconsulting.com wrote:
 +1 for builder pattern and fluent API
 On Mar 16, 2012 4:24 AM, Benedikt Ritter benerit...@googlemail.com
 wrote:

 Am 15. März 2012 21:20 schrieb Emmanuel Bourg ebo...@apache.org:
  Le 15/03/2012 20:26, Benedikt Ritter a écrit :
 
 
  How about you Emmanuel? Could sebb convince you? ;-) How about this:
  I'll create a patch and attach it to JIRA. Then we'll have a better
  basis for discussion.
 
 
  Sorry but I'm not convinced. I see exactly where this leads.
 

 Hey Emmanuel,

 I accept that, and I respect your decision. I don't want to argue with
 you on this, I just want to understand your decision. From Effective
 Java, I learned if you are facing a constructor with lots of optional
 arguments, consider using the builder pattern. Can you explain, why
 you think it is not appropriate in this case? You said it is too
 verbose, but it's just one additional call, compared with what we have
 now.

 The advantage of the builder (sorry now I'm arguing :) is, that nobody
 has to remember to validate the format. Even if validation is
 something that is package private, that could lead to the point where
 someone forgets to add that line. Now you could say we have unit tests
 for that. But isn't it the responsibility of an object to verify that
 it is being instantiated into a valid state?
 Also we don't know yet, if there always will be only one package in
 the library. What if, we add another package (o.a.c.csv.beanmapping or
 what ever) and we want to use CSVFormat there. Then we would have to
 make validate public, exposing it to the outside world.

  If you have some free time and want to do something fun you can try
  reimplementing the parser with less array copies. Commons CSV is still
  behind the other APIs on the performance side.
 

 I'll have a look at that, and at what Ted suggested on the other thread.

 TIA for taking your time to explain!!
 Benedikt

  Emmanuel Bourg
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-16 Thread sebb
On 16 March 2012 14:53, Matt Benson gudnabr...@gmail.com wrote:
 The votes show that several of us in the Commons community are leaning
 more and more toward fluent APIs these days.  But rather than wasting
 time going back and forth arguing about it, why not build the fluent
 API separately from the rest of the package and leave the choice up to
 the user?  I don't think I'm incorrect to say that this is more or
 less what [digester]3 does.

My proposed builder does not change the current code at all.
However ideally it should be defined in the same class so as to keep
the constructor private (and it will need access to a private
constant).

Note that the original methods are also fluent in the sense that they
can be chained together.
The only difference is that the proposed fluent interface requires a
starting method (or it could be a ctor) and a build() method call at
the end.

If required, both sets of methods could be maintained going forward,
but that seems like an unnecessary maintenance overhead.

CSV has yet to be released, so there's no need for backwards
compatibilty and therefore we don't _need_ to keep the existing
methods.

 Matt

 On Fri, Mar 16, 2012 at 7:34 AM, James Carman
 jcar...@carmanconsulting.com wrote:
 +1 for builder pattern and fluent API
 On Mar 16, 2012 4:24 AM, Benedikt Ritter benerit...@googlemail.com
 wrote:

 Am 15. März 2012 21:20 schrieb Emmanuel Bourg ebo...@apache.org:
  Le 15/03/2012 20:26, Benedikt Ritter a écrit :
 
 
  How about you Emmanuel? Could sebb convince you? ;-) How about this:
  I'll create a patch and attach it to JIRA. Then we'll have a better
  basis for discussion.
 
 
  Sorry but I'm not convinced. I see exactly where this leads.
 

 Hey Emmanuel,

 I accept that, and I respect your decision. I don't want to argue with
 you on this, I just want to understand your decision. From Effective
 Java, I learned if you are facing a constructor with lots of optional
 arguments, consider using the builder pattern. Can you explain, why
 you think it is not appropriate in this case? You said it is too
 verbose, but it's just one additional call, compared with what we have
 now.

 The advantage of the builder (sorry now I'm arguing :) is, that nobody
 has to remember to validate the format. Even if validation is
 something that is package private, that could lead to the point where
 someone forgets to add that line. Now you could say we have unit tests
 for that. But isn't it the responsibility of an object to verify that
 it is being instantiated into a valid state?
 Also we don't know yet, if there always will be only one package in
 the library. What if, we add another package (o.a.c.csv.beanmapping or
 what ever) and we want to use CSVFormat there. Then we would have to
 make validate public, exposing it to the outside world.

  If you have some free time and want to do something fun you can try
  reimplementing the parser with less array copies. Commons CSV is still
  behind the other APIs on the performance side.
 

 I'll have a look at that, and at what Ted suggested on the other thread.

 TIA for taking your time to explain!!
 Benedikt

  Emmanuel Bourg
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-16 Thread James Carman
Yes this reminds me very much of the digester3 scenario.  Let's start
coding and quit talking!  As sebb pointed out, this is unreleased (at least
from our pmc) code.  Let's try on a few of the alternatives and see which
one looks/feels better (does this API make my butt look big?).
On Mar 16, 2012 10:53 AM, Matt Benson gudnabr...@gmail.com wrote:

 The votes show that several of us in the Commons community are leaning
 more and more toward fluent APIs these days.  But rather than wasting
 time going back and forth arguing about it, why not build the fluent
 API separately from the rest of the package and leave the choice up to
 the user?  I don't think I'm incorrect to say that this is more or
 less what [digester]3 does.

 Matt

 On Fri, Mar 16, 2012 at 7:34 AM, James Carman
 jcar...@carmanconsulting.com wrote:
  +1 for builder pattern and fluent API
  On Mar 16, 2012 4:24 AM, Benedikt Ritter benerit...@googlemail.com
  wrote:
 
  Am 15. März 2012 21:20 schrieb Emmanuel Bourg ebo...@apache.org:
   Le 15/03/2012 20:26, Benedikt Ritter a écrit :
  
  
   How about you Emmanuel? Could sebb convince you? ;-) How about this:
   I'll create a patch and attach it to JIRA. Then we'll have a better
   basis for discussion.
  
  
   Sorry but I'm not convinced. I see exactly where this leads.
  
 
  Hey Emmanuel,
 
  I accept that, and I respect your decision. I don't want to argue with
  you on this, I just want to understand your decision. From Effective
  Java, I learned if you are facing a constructor with lots of optional
  arguments, consider using the builder pattern. Can you explain, why
  you think it is not appropriate in this case? You said it is too
  verbose, but it's just one additional call, compared with what we have
  now.
 
  The advantage of the builder (sorry now I'm arguing :) is, that nobody
  has to remember to validate the format. Even if validation is
  something that is package private, that could lead to the point where
  someone forgets to add that line. Now you could say we have unit tests
  for that. But isn't it the responsibility of an object to verify that
  it is being instantiated into a valid state?
  Also we don't know yet, if there always will be only one package in
  the library. What if, we add another package (o.a.c.csv.beanmapping or
  what ever) and we want to use CSVFormat there. Then we would have to
  make validate public, exposing it to the outside world.
 
   If you have some free time and want to do something fun you can try
   reimplementing the parser with less array copies. Commons CSV is still
   behind the other APIs on the performance side.
  
 
  I'll have a look at that, and at what Ted suggested on the other thread.
 
  TIA for taking your time to explain!!
  Benedikt
 
   Emmanuel Bourg
  
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
  For additional commands, e-mail: dev-h...@commons.apache.org
 
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org




Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-15 Thread Benedikt Ritter
Am 14. März 2012 22:47 schrieb sebb seb...@gmail.com:
 On 14 March 2012 21:40, Benedikt Ritter benerit...@googlemail.com wrote:
 Am 14. März 2012 22:33 schrieb Emmanuel Bourg ebo...@apache.org:
 Le 14/03/2012 22:25, Benedikt Ritter a écrit :


 I agree with you on this. However, I think it would be better to tie
 validation to the object creation. Maybe the Builder Pattern like
 shown in Effective Java p. 14-15 is a reasonable solution for this
 case? It would be a bit more verbose, but we can be sure that
 everything will be validated.


 That's too verbose, please let's keep this simple API.


 okay!
 although, I don't find
 CSVFormat format =
 CSVFormat.defaults().withDelimiter('#').withCommentStart('/').build()
 too verbose ;)

 Agree entirely.

 And parse and format could perform an implicit build().

 It would also allow one to eliminate the additional instance creation.


How about you Emmanuel? Could sebb convince you? ;-) How about this:
I'll create a patch and attach it to JIRA. Then we'll have a better
basis for discussion.

 Bonne nuit!

 Gute Nacht ?

 Benedikt

 Emmanuel Bourg


 -
 To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org


 -
 To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-15 Thread Emmanuel Bourg

Le 15/03/2012 20:26, Benedikt Ritter a écrit :


How about you Emmanuel? Could sebb convince you? ;-) How about this:
I'll create a patch and attach it to JIRA. Then we'll have a better
basis for discussion.


Sorry but I'm not convinced. I see exactly where this leads.

If you have some free time and want to do something fun you can try 
reimplementing the parser with less array copies. Commons CSV is still 
behind the other APIs on the performance side.


Emmanuel Bourg



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread Emmanuel Bourg

Le 14/03/2012 21:52, Benedikt Ritter a écrit :


the subject of this mail is pretty self-explanatory. Why do we need a
package private validate() method, given the fact, that users can not
create custom instances (constructor is package private)? You could
even argue, that no validation is needed at all, since we are in
control of what formats can be created. However I'd say, the
validation makes sure that we do not unintentionally create invalid
formats. But then validate() can be made private and called at the end
of the constructor.


Because the format could temporarily be in an invalid state. Something 
like this would break (a bit far fetched, I agree):


CSVFormat.DEFAULT.withDelimiter('#').withCommentStart('/');

Emmanuel Bourg



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread Benedikt Ritter
Am 14. März 2012 22:02 schrieb Emmanuel Bourg ebo...@apache.org:
 Le 14/03/2012 21:52, Benedikt Ritter a écrit :


 the subject of this mail is pretty self-explanatory. Why do we need a
 package private validate() method, given the fact, that users can not
 create custom instances (constructor is package private)? You could
 even argue, that no validation is needed at all, since we are in
 control of what formats can be created. However I'd say, the
 validation makes sure that we do not unintentionally create invalid
 formats. But then validate() can be made private and called at the end
 of the constructor.


 Because the format could temporarily be in an invalid state. Something like
 this would break (a bit far fetched, I agree):

 CSVFormat.DEFAULT.withDelimiter('#').withCommentStart('/');


okay, I understand. But doesn't that make things worse? Users can
create invalid formats, but they can not call validate(), because it
is package private.

 Emmanuel Bourg


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread sebb
On 14 March 2012 21:02, Emmanuel Bourg ebo...@apache.org wrote:
 Le 14/03/2012 21:52, Benedikt Ritter a écrit :


 the subject of this mail is pretty self-explanatory. Why do we need a
 package private validate() method, given the fact, that users can not
 create custom instances (constructor is package private)? You could
 even argue, that no validation is needed at all, since we are in
 control of what formats can be created. However I'd say, the
 validation makes sure that we do not unintentionally create invalid
 formats. But then validate() can be made private and called at the end
 of the constructor.


 Because the format could temporarily be in an invalid state. Something like
 this would break (a bit far fetched, I agree):

 CSVFormat.DEFAULT.withDelimiter('#').withCommentStart('/');

In which case, reversing the order would work.
We would just have to document this as a restriction.

The problem with leaving the validation until later is that it
decouples the cause and effect.
This makes it a bit harder to debug.

It's also simpler if there is a single validation method.
At present some of the setters also perform validation.

BTW, we should probably reject delimiter == DISABLED.

Also, the DISABLED constant needs to be public (or available via a
public getter) otherwise it's not possible to disable all but the
delimiter.
Using a getter would allow the constant to be changed if it became necessary.

 Emmanuel Bourg


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread Benedikt Ritter
Am 14. März 2012 22:16 schrieb sebb seb...@gmail.com:
 On 14 March 2012 21:02, Emmanuel Bourg ebo...@apache.org wrote:
 Le 14/03/2012 21:52, Benedikt Ritter a écrit :


 the subject of this mail is pretty self-explanatory. Why do we need a
 package private validate() method, given the fact, that users can not
 create custom instances (constructor is package private)? You could
 even argue, that no validation is needed at all, since we are in
 control of what formats can be created. However I'd say, the
 validation makes sure that we do not unintentionally create invalid
 formats. But then validate() can be made private and called at the end
 of the constructor.


 Because the format could temporarily be in an invalid state. Something like
 this would break (a bit far fetched, I agree):

 CSVFormat.DEFAULT.withDelimiter('#').withCommentStart('/');

 In which case, reversing the order would work.
 We would just have to document this as a restriction.

 The problem with leaving the validation until later is that it
 decouples the cause and effect.
 This makes it a bit harder to debug.

 It's also simpler if there is a single validation method.
 At present some of the setters also perform validation.


I agree with you on this. However, I think it would be better to tie
validation to the object creation. Maybe the Builder Pattern like
shown in Effective Java p. 14-15 is a reasonable solution for this
case? It would be a bit more verbose, but we can be sure that
everything will be validated.

 BTW, we should probably reject delimiter == DISABLED.

 Also, the DISABLED constant needs to be public (or available via a
 public getter) otherwise it's not possible to disable all but the
 delimiter.
 Using a getter would allow the constant to be changed if it became necessary.


Only if we remove Serializable...

 Emmanuel Bourg


 -
 To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread Emmanuel Bourg

Le 14/03/2012 22:15, Benedikt Ritter a écrit :


okay, I understand. But doesn't that make things worse? Users can
create invalid formats, but they can not call validate(), because it
is package private.


Worse? There was no validation originally, users could do any kind of 
absurd things.


I don't think this is an issue, the format will be validated one line 
later when it's used for parsing or printing.


Emmanuel Bourg



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread Emmanuel Bourg

Le 14/03/2012 22:25, Benedikt Ritter a écrit :


I agree with you on this. However, I think it would be better to tie
validation to the object creation. Maybe the Builder Pattern like
shown in Effective Java p. 14-15 is a reasonable solution for this
case? It would be a bit more verbose, but we can be sure that
everything will be validated.


That's too verbose, please let's keep this simple API.

Emmanuel Bourg



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread Benedikt Ritter
Am 14. März 2012 22:33 schrieb Emmanuel Bourg ebo...@apache.org:
 Le 14/03/2012 22:25, Benedikt Ritter a écrit :


 I agree with you on this. However, I think it would be better to tie
 validation to the object creation. Maybe the Builder Pattern like
 shown in Effective Java p. 14-15 is a reasonable solution for this
 case? It would be a bit more verbose, but we can be sure that
 everything will be validated.


 That's too verbose, please let's keep this simple API.


okay!
although, I don't find
CSVFormat format =
CSVFormat.defaults().withDelimiter('#').withCommentStart('/').build()
too verbose ;)

Bonne nuit!
Benedikt

 Emmanuel Bourg


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread Emmanuel Bourg

Le 14/03/2012 22:16, sebb a écrit :


In which case, reversing the order would work.
We would just have to document this as a restriction.


Too much burden on the user.



The problem with leaving the validation until later is that it
decouples the cause and effect.
This makes it a bit harder to debug.


It'll break one line bellow the creation of the format in 99.99% of the 
cases, and the exception thrown is very explicit.




It's also simpler if there is a single validation method.
At present some of the setters also perform validation.


I don't see this as an issue.



BTW, we should probably reject delimiter == DISABLED.


Ok



Also, the DISABLED constant needs to be public (or available via a
public getter) otherwise it's not possible to disable all but the
delimiter.


I'd better use a null value to disable a feature than exposing the 
DISABLED constant.



Emmanuel Bourg



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread sebb
On 14 March 2012 21:41, Emmanuel Bourg ebo...@apache.org wrote:
 Le 14/03/2012 22:16, sebb a écrit :


 In which case, reversing the order would work.
 We would just have to document this as a restriction.


 Too much burden on the user.



 The problem with leaving the validation until later is that it
 decouples the cause and effect.
 This makes it a bit harder to debug.


 It'll break one line bellow the creation of the format in 99.99% of the
 cases, and the exception thrown is very explicit.



 It's also simpler if there is a single validation method.
 At present some of the setters also perform validation.


 I don't see this as an issue.



 BTW, we should probably reject delimiter == DISABLED.


 Ok



 Also, the DISABLED constant needs to be public (or available via a
 public getter) otherwise it's not possible to disable all but the
 delimiter.


 I'd better use a null value to disable a feature than exposing the DISABLED
 constant.

It's not possible to use null for a char value.


 Emmanuel Bourg


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread sebb
On 14 March 2012 21:40, Benedikt Ritter benerit...@googlemail.com wrote:
 Am 14. März 2012 22:33 schrieb Emmanuel Bourg ebo...@apache.org:
 Le 14/03/2012 22:25, Benedikt Ritter a écrit :


 I agree with you on this. However, I think it would be better to tie
 validation to the object creation. Maybe the Builder Pattern like
 shown in Effective Java p. 14-15 is a reasonable solution for this
 case? It would be a bit more verbose, but we can be sure that
 everything will be validated.


 That's too verbose, please let's keep this simple API.


 okay!
 although, I don't find
 CSVFormat format =
 CSVFormat.defaults().withDelimiter('#').withCommentStart('/').build()
 too verbose ;)

Agree entirely.

And parse and format could perform an implicit build().

It would also allow one to eliminate the additional instance creation.

 Bonne nuit!

Gute Nacht ?

 Benedikt

 Emmanuel Bourg


 -
 To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
 For additional commands, e-mail: dev-h...@commons.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread Emmanuel Bourg

Le 14/03/2012 22:44, sebb a écrit :


It's not possible to use null for a char value.


It assumed we changed the signature of the methods with 
java.lang.Character, while retaining the primitive internally. But 
that's not possible due to the getters.


If possible I'd like to not expose this constant. The idea was that we 
start with disabled features by default in the predefined formats, and 
then we enable what's necessary. Thus we never have to disable anything.


Emmanuel Bourg



smime.p7s
Description: S/MIME Cryptographic Signature


Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread sebb
On 14 March 2012 21:56, Emmanuel Bourg ebo...@apache.org wrote:
 Le 14/03/2012 22:44, sebb a écrit :


 It's not possible to use null for a char value.


 It assumed we changed the signature of the methods with java.lang.Character,
 while retaining the primitive internally. But that's not possible due to the
 getters.

I've just realised that the DISABLED value is actually exposed - just
use a format with a disabled character and fetch it.

It's not too late to change to using Character.

 If possible I'd like to not expose this constant. The idea was that we start
 with disabled features by default in the predefined formats, and then we
 enable what's necessary. Thus we never have to disable anything.

It's not possible currently to create a format with encapsulator,
commentStart and escape all null, except by knowing the value of
DISABLED.

Maybe that's not allowed, but it's not rejected by validation, and can
be created by using '\ufffe' directly.

Likewise, encapsulator and escape are not mutually exclusive.
Should they be?

 Emmanuel Bourg

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread Emmanuel Bourg

Le 14/03/2012 23:35, sebb a écrit :


It's not too late to change to using Character.


The issue is the parser, this will probably degrade the performance. 
Unless the fields are made package private and accessed directly by the 
parser.




It's not possible currently to create a format with encapsulator,
commentStart and escape all null, except by knowing the value of
DISABLED.


The solution might be to introduce a quoting mode to the format. I 
planned to add this for the printer, but it can be useful for the parser 
too. This mode would have 3 states:

- NEVER: Don't use quotes, even if the encapsulator is defined
- ALWAYS: Always enclose values into quotes
- REQUIRED: Enclose the values only if necessary

Thus the quotes could be disabled with:

CSVFormat.DEFAULT.withEncapsulation(NEVER)



Likewise, encapsulator and escape are not mutually exclusive.
Should they be?


I wish it was, but MySQL actually produces files with quotes and escaped 
characters (delimiter and line feeds). I'm reviewing other RDBMS to see 
what formats we can expect in the wild.


Emmanuel Bourg




smime.p7s
Description: S/MIME Cryptographic Signature


Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread sebb
On 14 March 2012 22:54, Emmanuel Bourg ebo...@apache.org wrote:
 Le 14/03/2012 23:35, sebb a écrit :


 It's not too late to change to using Character.


 The issue is the parser, this will probably degrade the performance. Unless
 the fields are made package private and accessed directly by the parser.

I don't see why, so long as you fetch the values once at the start.

There's a theoretical problem with using a valid char value as a
disabled indicator, at least with the parser as it stands.
It assumes that the disabled char cannot occur in a file; that is not
strictly true, so it could detect an escape where there is none.

The example string pu\\ufffeblic is parsed as puBELlic when using
CSVFormat.TDF.withUnicodeEscapesInterpreted(true) - i.e. the disabled
char is treated as an escape, and \b = BEL.

This could be avoided by checking isEscaping in the parser; similarly
for the other chars that can be disabled.



 It's not possible currently to create a format with encapsulator,
 commentStart and escape all null, except by knowing the value of
 DISABLED.


 The solution might be to introduce a quoting mode to the format. I planned
 to add this for the printer, but it can be useful for the parser too. This
 mode would have 3 states:
 - NEVER: Don't use quotes, even if the encapsulator is defined

I think it would be confusing to have a format with an encapsulator
that is not used.

 - ALWAYS: Always enclose values into quotes
 - REQUIRED: Enclose the values only if necessary

 Thus the quotes could be disabled with:

 CSVFormat.DEFAULT.withEncapsulation(NEVER)

Or provide a constant format with all 3 disabled.

But it would still be simpler to be able to override each and every
char independently; using Character is going to be the simplest way to
achieve that.

Probably still need to make output encapsulation switchable, but
that's a different matter.



 Likewise, encapsulator and escape are not mutually exclusive.
 Should they be?


 I wish it was, but MySQL actually produces files with quotes and escaped
 characters (delimiter and line feeds). I'm reviewing other RDBMS to see what
 formats we can expect in the wild.

OK.

 Emmanuel Bourg



-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [csv] Why does CSVFormat provide a validate() method instead of validating parameters passed to its constructor?

2012-03-14 Thread Emmanuel Bourg

Le 15/03/2012 00:32, sebb a écrit :


I don't see why, so long as you fetch the values once at the start.


I tried that and the parser was slower. Don't ask me why.



There's a theoretical problem with using a valid char value as a
disabled indicator, at least with the parser as it stands.
It assumes that the disabled char cannot occur in a file; that is not
strictly true, so it could detect an escape where there is none.

The example string pu\\ufffeblic is parsed as puBELlic when using
CSVFormat.TDF.withUnicodeEscapesInterpreted(true) - i.e. the disabled
char is treated as an escape, and \b =BEL.

This could be avoided by checking isEscaping in the parser; similarly
for the other chars that can be disabled.


Good point, I didn't see this one. It can also be avoided by filtering 
invalid sequences in UnicodeUnescapeReader (or \ufffe specifically). 
Character.isDefined(char) will tell us if the character exists. U+FFFE 
doesn't:


http://www.fileformat.info/info/unicode/char/fffe

Emmanuel Bourg



smime.p7s
Description: S/MIME Cryptographic Signature