Pierre GM writes:

> On Nov 11, 2010, at 8:31 PM, Lluís wrote:

>> Pierre GM writes:
>> 
>>> In practice, that's exactly what happens under the hood when
>>> genfromtxt tries to guess the output type of the converter. It tries a
>>> single value ('1'), fails, and decides that the result must be an
>>> object... Probably not the best strategy, as it crashes in your
>>> case. But yours is a buggy case anyway.
>> [...]
>>> Now, we can argue over the very last point: if both a converter and a
>>> dtype are specified, which one should take precedence?
>>> You have my opinion, let's hear yours.
>> 
>> What about delaying the calculation of converters? Instead of using
>> type checks with fake data, 'StringConverter.update' could take an
>> optional argument 'input_sample' (defaulting to "1") in order to
>> perform its checks.
>> 
>> Then, use real data from the first (non-comment, non-names) line of the
>> input file when calling 'StringConverter.update' in 'genfromtxt'.

> Mmh. That's an idea... Do you have a patch to suggest?

The patch below will work as long as 'first_values' is guaranteed to
always contain valid data and as long as its indexes match those of
'converters' (which I simply haven't checked).

--- numpy/lib/_iotools.py.bak	2010-11-14 21:17:12.000000000 +0100
+++ numpy/lib/_iotools.py	2010-11-14 21:19:02.000000000 +0100
@@ -672,7 +672,7 @@
             self._status = _status
             self.iterupgrade(value)
 
-    def update(self, func, default=None, missing_values='', locked=False):
+    def update(self, func, default=None, missing_values='', locked=False, input_sample='1'):
         """
         Set StringConverter attributes directly.
 
@@ -689,6 +689,7 @@
         locked : bool, optional
             Whether the StringConverter should be locked to prevent automatic
             upgrade or not. Default is False.
+        input_sample : A sample input to test the validity of `func`.
 
         Notes
         -----
@@ -705,7 +706,7 @@
             self.type = self._getsubdtype(default)
         else:
             try:
-                tester = func('1')
+                tester = func(input_sample)
             except (TypeError, ValueError):
                 tester = None
             self.type = self._getsubdtype(tester)
--- numpy/lib/io.py.bak	2010-11-14 21:14:05.000000000 +0100
+++ numpy/lib/io.py	2010-11-14 21:24:37.000000000 +0100
@@ -1235,7 +1235,8 @@
                 continue
         converters[i].update(conv, locked=True,
                              default=filling_values[i],
-                             missing_values=missing_values[i],)
+                             missing_values=missing_values[i],
+                             input_sample=first_values[i])
         uc_update.append((i, conv))
     # Make sure we have the corrected keys in user_converters...
     user_converters.update(uc_update)
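
To show the effect outside of numpy, here is a minimal standalone sketch
of the same try/except probe. The names 'probe_converter' and
'parse_date' are hypothetical and are not part of numpy; the date string
simply stands in for a value taken from the first data line of a file.

```python
from datetime import datetime


def parse_date(text):
    # Example user converter: it raises ValueError on the fake sample '1'.
    return datetime.strptime(text, "%Y-%m-%d")


def probe_converter(func, input_sample="1"):
    # Hypothetical helper mirroring the probe in the patched update():
    # call the converter on a sample and fall back to None on failure.
    try:
        tester = func(input_sample)
    except (TypeError, ValueError):
        tester = None
    return tester


# With the fake default sample the probe fails, so the type is unknown.
fake_result = probe_converter(parse_date)

# With a real value the probe succeeds and the output type can be
# inferred from the result.
real_result = probe_converter(parse_date, input_sample="2010-11-14")
```

Note that only TypeError and ValueError are caught, as in the patch; a
converter that raises something else (say, IndexError) on the sample
would still propagate the exception.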
apa!

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth