Hi Jacob,

Thanks. To provide some specifics on my query:

1. Which version of arrow are you running?

   10.0.1

2. The error message provides an exact col,row position, have you checked the value there?

   Yes. It is int64. This is after running open_dataset without specifying a schema:

'''
arrow <- open_dataset(
  sources = "location of csv files",
  format = "csv"
)
'''

3. I have to correct the exact error message:

   CSV conversion error to int64: invalid value ' '

   I think arrow is telling me that the invalid value present is ' '.

4. This reminds me of cases where scientific notation is used for integers, which causes an error, but in those cases the message usually shows the value, e.g. "1e6". Here the invalid value is: ' '

5. I am really confused because when I used the disk.frame() function on the same CSVs, I did not encounter this problem with this column; it was cleanly encoded as a numeric variable.

Regards,

On Fri, Jan 27, 2023 at 9:43 AM Angelo Casalan <acasalan...@gmail.com> wrote:

> Hi,
>
> I hope you are well. I wish to ask how I can resolve this error:
>
> "CSV conversion error to int64: invalid value"
>
> To give an idea of my dataset: I have 4 CSVs, all placed in a local folder.
>
> The code below worked when importing:
>
> arrow<-open_dataset(
> sources="csv location",
> format="csv")
>
> However, when I run:
>
> arrow %>% count(column) %>% collect()
> nrow(arrow %>% collect)
>
> head(arrow %>% collect(),10 )
>
> I always get the same error message: "Invalid: In CSV column #12: Row
> #580. CSV conversion error to int64: invalid value"
>
> I tried going back to open_dataset(,schema() ), where the column that is
> giving me problems is set as utf8 or sometimes str in the schema argument.
>
> schema(
> col=utf8(),
> other nth columns
> )
>
> But I still encounter the same problem.
>
> The code below does not work either.
>
> arrow2<-arrow_table(arrow)
>
> Thanks in advance if you can help me.
>
> --
> Regards,
>
> Angelo Casalan
> Statistical Methodology Unit
>

--
Regards,

Angelo Casalan
Statistical Methodology Unit