Hello everyone,

I am trying to load the IMDb dataset into AsterixDB. It seems that some of
the rows are affected by broken escaping and end up not being inserted at
all. For example, I used the following statement:

LOAD DATASET movie_companies using localfs (
("path"="asterix_nc1://imdb-data/movie-companies.csv"),
("format"="delimited-text"),("delimiter"=","), ("null"="")
);

The schema is movie_companies (id: int, movie_id: int, company_id: int,
company_type_id: int, note: string) and the CSV file contains the following
row:

13893, 53192, 1376, 1, "(1986) (USA) (VHS) (included in \"The Best Of
Alfred Hitchcock, Vol. One\")"

This row is not loaded at all. The rest of the rows, which contain no such
string input, load successfully.
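
In case it helps to reproduce the issue outside AsterixDB, here is a minimal
Python sketch of how the same row parses under two quoting conventions:
backslash-escaped quotes (\") as in my file, and doubled quotes ("") as in
RFC 4180. It uses Python's csv module, not AsterixDB's loader, so it only
illustrates the ambiguity and is not a statement about how AsterixDB parses
the file. The helper names parse and to_doubled_quotes are made up for this
example.

```python
import csv
import io

# The problematic row from above, joined onto a single line.
ROW = ('13893, 53192, 1376, 1, "(1986) (USA) (VHS) '
       '(included in \\"The Best Of Alfred Hitchcock, Vol. One\\")"')


def parse(line, escapechar):
    """Parse one CSV line with Python's csv module (not AsterixDB's parser)."""
    reader = csv.reader(
        io.StringIO(line),
        delimiter=",",
        quotechar='"',
        escapechar=escapechar,
        skipinitialspace=True,  # the row has a space after each comma
    )
    return next(reader)


def to_doubled_quotes(line):
    """Rewrite backslash-escaped quotes (\\") as doubled quotes ("")."""
    return line.replace('\\"', '""')


# Backslash treated as an escape character.
with_backslash = parse(ROW, escapechar="\\")
# Backslash treated as an ordinary character.
without_escape = parse(ROW, escapechar=None)
# Row rewritten to doubled quotes, then parsed without an escape character.
converted = parse(to_doubled_quotes(ROW), escapechar=None)

print(len(with_backslash), with_backslash[-1])
print(len(without_escape), without_escape[-1])
print(len(converted), converted[-1])
```

With this parser, the row yields the expected five fields only when the
backslash is recognized as an escape character or when the quotes are
doubled. Otherwise the note is split at the comma inside the quoted title.
If the loader behaves similarly, rewriting \" as "" before loading might be a
workaround, but I have not confirmed that.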

Any suggestions?

Thanks,
Mehnaz
