data_file["data"], this works only if your CSV has a column with that name as well. read_csv can do exactly what you need, but you have to adapt the script to what is actually in your CSV (which only you know!). You need to understand what each statement does, just as you need to understand the processing you apply to your data (whether preprocessing or learning) in order to use any machine learning tool properly.
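As a rough illustration of adapting the script, here is a minimal sketch. It assumes a CSV whose header row names the feature columns and a label column; the names `f1`, `f2` and `target` and the inline data are made up for the example and would need to be replaced with whatever the real file contains.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for mydata.csv. The column names
# (f1, f2, target) are assumptions; use the names from your own header row.
csv_text = """f1,f2,target
1.0,2.0,0
1.5,1.8,0
5.0,8.0,1
6.0,9.0,1
"""

# read_csv returns a DataFrame keyed by the CSV header names. It has no
# "data", "target" or "feature_names" keys like the object from load_iris().
df = pd.read_csv(io.StringIO(csv_text))

target_column = "target"  # assumption: the label column has this name
feature_names = [c for c in df.columns if c != target_column]

X = df[feature_names].to_numpy()  # plays the role of iris["data"]
y = df[target_column].to_numpy()  # plays the role of iris["target"]

print(feature_names)
print(X.shape, y.shape)
```

With a real file, `pd.read_csv("mydata.csv")` would replace the `io.StringIO` call, and `df` can be used directly where the tutorial builds its DataFrame from the iris arrays.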
Matthieu

On Sun, Nov 8, 2020 at 12:44, Mahmood Naderan <mahmood...@gmail.com> wrote:

> Thanks for the replies.
>
> >I'd recommend just reading that csv file with e.g. pandas
> >(https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html),
> >and then just use the dataframe as input to scikit-learn utilities (you may
> >need to separate the features X from the target y).
>
> I am trying to follow the steps as described in
> https://towardsdatascience.com/a-step-by-step-introduction-to-pca-c0d78e26a0dd
>
> I changed
>
> iris = load_iris()
> colors = ["blue","red","green"]
> df = DataFrame(
>     data=np.c_[iris["data"], iris["target"]],
>     columns=iris["feature_names"] + ["target"])
>
> to
>
> data_file = pd.read_csv("mydata.csv")
> colors = ["blue","red","green","skyblue","indigo","plum","coral","orange","gray","lime"]
> df = DataFrame(
>     data=np.c_[data_file["data"], data_file["target"]],
>     columns=data_file["feature_names"] + ["target"])
>
> But I get this error:
>
> Traceback (most recent call last):
>   File "/home/mahmood/.local/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2895, in get_loc
>     return self._engine.get_loc(casted_key)
>   File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
>   File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
>   File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
>   File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
> KeyError: 'data'
>
> The above exception was the direct cause of the following exception:
>
> Traceback (most recent call last):
>   File "pca_gromacs.py", line 12, in <module>
>     data=np.c_[data_file["data"], data_file["target"]], columns=data_file["feature_names"] + ["target"]
>   File "/home/mahmood/.local/lib/python3.6/site-packages/pandas/core/frame.py", line 2906, in __getitem__
>     indexer = self.columns.get_loc(key)
>   File "/home/mahmood/.local/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
>     raise KeyError(key) from err
> KeyError: 'data'
>
> It seems that load_iris() do more than read_csv().
>
> Regards,
> Mahmood
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

--
Quantitative researcher, Ph.D.
Blog: http://blog.audio-tk.com/
LinkedIn: http://www.linkedin.com/in/matthieubrucher