Re: [Python-Dev] These csv test cases seem incorrect to me...

2007-03-14 Thread M.-A. Lemburg
Hi Skip,

On 2007-03-12 03:01, [EMAIL PROTECTED] wrote:
> I decided it would be worthwhile to have a csv module written in Python (no
> C underpinnings) for a number of reasons:
> 
> * It will probably be easier to add Unicode support to a Python version
> 
> * More people will be able to read/grok/modify/fix bugs in a Python
>   implementation than in the current mixed Python/C implementation.
> 
> * With alternative implementations of Python available (PyPy,
>   IronPython, Jython) it makes sense to have a Python version they can
>   use.

Lots of good reasons :-)

I've written a Python-only, Unicode-aware CSV module for a client (mostly
because CSV data tends to be quirky and I needed a quick way of dealing
with corner cases). Perhaps I can get them to donate it to the PSF...

> I'm far from having anything which will pass the current test suite, but in
> diagnosing some of my current failures I noticed a couple test cases which
> seem wrong.  In the TestDialectExcel class I see these two questionable
> tests:
> 
> def test_quotes_and_more(self):
>     self.readerAssertEqual('"a"b', [['ab']])
> 
> def test_quote_and_quote(self):
>     self.readerAssertEqual('"a" "b"', [['a "b"']])
> 
> It seems to me that if a field starts with a quote it *has* to be a quoted
> field.  Any quotes appearing within a quoted field have to be escaped and
> the field has to end with a quote.  Both of these test cases fail on one or the
> other assumption.  If they are indeed both correct and I'm just looking at
> things crosseyed I think they at least deserve comments explaining why they
> are correct.
> 
> Both test cases date from the first checkin.  I performed the checkin
> because, of the group developing the module, I believe I was the only one
> with checkin privileges at the time, not because I wrote the test cases.
> 
> Any ideas about why these test cases are in there?  I can't imagine Excel
> generating either one.

My recommendation: Let the module do whatever Excel does with such data.

-- 
Marc-Andre Lemburg
eGenix.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] These csv test cases seem incorrect to me...

2007-03-11 Thread skip

>> I'm far from having anything which will pass the current test suite,
>> but in diagnosing some of my current failures I noticed a couple test
>> cases which seem wrong.  In the TestDialectExcel class I see these
>> two questionable tests:
>> 
>> def test_quotes_and_more(self):
>>     self.readerAssertEqual('"a"b', [['ab']])
>> 
>> def test_quote_and_quote(self):
>>     self.readerAssertEqual('"a" "b"', [['a "b"']])

Andrew> The point was to produce the same results as Excel. Sure, Excel
Andrew> probably doesn't generate crap like this itself, but 3rd parties
Andrew> do, and people complain if we don't parse it just like Excel
Andrew> (sigh).

(sigh) indeed.

Thanks,

Skip


Re: [Python-Dev] These csv test cases seem incorrect to me...

2007-03-11 Thread Jon Ribbens
Andrew McNamara <[EMAIL PROTECTED]> wrote:
> The point was to produce the same results as Excel. Sure, Excel probably
> doesn't generate crap like this itself, but 3rd parties do, and people
> complain if we don't parse it just like Excel (sigh).

The slight problem with copying Excel is that Excel won't parse its
own CSV output.


Re: [Python-Dev] These csv test cases seem incorrect to me...

2007-03-11 Thread Andrew McNamara
>I decided it would be worthwhile to have a csv module written in Python (no
>C underpinnings) for a number of reasons:

Several other people have already done this. I will forward you their
e-mail addresses in a separate private e-mail.

>I'm far from having anything which will pass the current test suite, but in
>diagnosing some of my current failures I noticed a couple test cases which
>seem wrong.  In the TestDialectExcel class I see these two questionable
>tests:
>
>def test_quotes_and_more(self):
>    self.readerAssertEqual('"a"b', [['ab']])
>
>def test_quote_and_quote(self):
>    self.readerAssertEqual('"a" "b"', [['a "b"']])
[...]
>Any ideas about why these test cases are in there?  I can't imagine Excel
>generating either one.

The point was to produce the same results as Excel. Sure, Excel probably
doesn't generate crap like this itself, but 3rd parties do, and people
complain if we don't parse it just like Excel (sigh).
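
For anyone who wants to see the lenient behaviour for themselves, here is a
minimal sketch. It assumes the standard-library csv module with its default
excel dialect; parse_line and lenient_results are just illustrative names,
not anything from the test suite.

```python
import csv

def parse_line(line, **fmtparams):
    """Parse a single line of CSV text and return the list of rows."""
    return list(csv.reader([line], **fmtparams))

# Default (non-strict) excel dialect: text that follows a closing quote
# is appended to the field instead of being rejected.
lenient_results = {
    text: parse_line(text)
    for text in ('"a"b', '"a" "b"')
}

for text, rows in lenient_results.items():
    print(repr(text), "->", rows)
```

The two inputs are the ones from the questioned tests, so the printed rows
can be compared directly against the expected values given there.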

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/


[Python-Dev] These csv test cases seem incorrect to me...

2007-03-11 Thread skip

I decided it would be worthwhile to have a csv module written in Python (no
C underpinnings) for a number of reasons:

* It will probably be easier to add Unicode support to a Python version

* More people will be able to read/grok/modify/fix bugs in a Python
  implementation than in the current mixed Python/C implementation.

* With alternative implementations of Python available (PyPy,
  IronPython, Jython) it makes sense to have a Python version they can
  use.

I'm far from having anything which will pass the current test suite, but in
diagnosing some of my current failures I noticed a couple test cases which
seem wrong.  In the TestDialectExcel class I see these two questionable
tests:

def test_quotes_and_more(self):
    self.readerAssertEqual('"a"b', [['ab']])

def test_quote_and_quote(self):
    self.readerAssertEqual('"a" "b"', [['a "b"']])

It seems to me that if a field starts with a quote it *has* to be a quoted
field.  Any quotes appearing within a quoted field have to be escaped and
the field has to end with a quote.  Both of these test cases fail on one or the
other assumption.  If they are indeed both correct and I'm just looking at
things crosseyed I think they at least deserve comments explaining why they
are correct.
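
The stricter reading described above can be roughly approximated with the
csv module's strict format parameter, which raises csv.Error on malformed
input. This is only a sketch under that assumption; is_accepted_strictly is
a hypothetical helper name, and strict mode is not necessarily identical to
the rule as I stated it.

```python
import csv

def is_accepted_strictly(line):
    """Return True if csv.reader parses `line` without error in strict mode."""
    try:
        list(csv.reader([line], strict=True))
    except csv.Error:
        return False
    return True

# The first two inputs are the questioned test cases; the last two are
# conventionally quoted fields (an escaped quote, and two separate fields).
for text in ('"a"b', '"a" "b"', '"a""b"', '"a","b"'):
    print(repr(text), is_accepted_strictly(text))
```

With strict=True, anything other than a delimiter, another quote, or the end
of the line directly after a closing quote is treated as an error, which is
the case both questioned inputs fall into.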

Both test cases date from the first checkin.  I performed the checkin
because, of the group developing the module, I believe I was the only one
with checkin privileges at the time, not because I wrote the test cases.

Any ideas about why these test cases are in there?  I can't imagine Excel
generating either one.

Thx,

Skip
