On Thu, May 7, 2015 at 2:26 AM, Dammy <damilarefagb...@gmail.com> wrote:
> Hi,
> I am trying to use numpy.genfromtxt to import a csv for classification using
> scikit-learn. The first column in the csv is a string type class label while
> 200+ extra columns are integer features.
> Please I wish to find out how I can use the genfromtxt function to specify a
> dtype of string for the first column while specifying int type for all other
> columns.
>
> I have tried using "dtype=None" as shown below, but when I print
> dataset.shape, I get (number_of_rows,) i.e. no columns are read in:
> dataset = np.genfromtxt(file,delimiter=',', skip_header=True)
>
> I also tried setting the dtypes as shown in the examples below, but I get
> the same error as dtype=None:

These dtypes will create structured arrays:
http://docs.scipy.org/doc/numpy/user/basics.rec.html

So it is expected that the shape is just the number of rows. The columns are
part of the dtype and can be accessed by field name, like a dictionary:

In [21]: d = np.ones(3, dtype='S2, int8')

In [22]: d
Out[22]:
array([('1', 1), ('1', 1), ('1', 1)],
      dtype=[('f0', 'S2'), ('f1', 'i1')])

In [23]: d.shape
Out[23]: (3,)

In [24]: d.dtype.names
Out[24]: ('f0', 'f1')

In [25]: d[0]
Out[25]: ('1', 1)

In [26]: d['f0']
Out[26]:
array(['1', '1', '1'],
      dtype='|S2')

In [27]: d['f1']
Out[27]: array([1, 1, 1], dtype=int8)

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
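
For the scikit-learn use case in the question, one possible way to go from the structured array to a label vector plus a 2-D feature matrix is sketched below. This is only an illustration under some assumptions: the CSV contents, the names csv_text, labels and X are made up for the example, and encoding="utf-8" assumes a NumPy version that supports that keyword (1.14 or later). The field names f0, f1, ... are the defaults genfromtxt assigns when no names are given.

```python
import io

import numpy as np

# Hypothetical stand-in for the real file: one string label column
# followed by integer feature columns, with a header row.
csv_text = "label,a,b,c\ncat,1,2,3\ndog,4,5,6\ncat,7,8,9\n"

# dtype=None lets genfromtxt guess a type per column. Because the
# columns have mixed types, the result is a 1-D structured array.
data = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    skip_header=1,
    dtype=None,
    encoding="utf-8",
)

# First field holds the class labels.
labels = data["f0"]

# Stack the remaining fields side by side into a plain 2-D integer array.
feature_names = data.dtype.names[1:]
X = np.column_stack([data[name] for name in feature_names])

print(data.shape)  # one entry per row; the columns live in the dtype
print(X.shape)     # (rows, features)
```

Here labels and X could then be passed to an estimator's fit method. Reading the label column and the feature columns in two separate genfromtxt calls with usecols would be another option and avoids the structured array altogether.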