[
https://issues.apache.org/jira/browse/CASSANDRA-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067846#comment-15067846
]
Stefania commented on CASSANDRA-10854:
--------------------------------------
There's a slight problem with the new code from CASSANDRA-9302. Once that's
fixed, here is the error message:
{code}
COPY tracks_by_album (album_title, album_year, performer, album_genre,
track_number, track_title) FROM '10854.csv' WITH HEADER = 'true';
[...]
Failed to import 1 rows: ValueError - Cannot insert null value for primary key
column 'track_number'. If you want to insert empty strings, consider using the
WITH NULL=<marker> option for COPY. - given up after 1 attempts
[...]
{code}
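A rough sketch of the kind of check behind this message, for illustration only. The names COLUMNS, PRIMARY_KEY_COLUMNS and check_primary_keys are hypothetical and are not cqlsh's real internals, and the empty-string default for the null marker is an assumption. It parses the offending CSV line and rejects a primary key field that equals the null marker:

```python
import csv
import io

# Hypothetical names for illustration; not taken from cqlsh.
COLUMNS = ['album_title', 'album_year', 'performer',
           'album_genre', 'track_number', 'track_title']
PRIMARY_KEY_COLUMNS = ('album_title', 'album_year', 'track_number')


def check_primary_keys(row, null_marker=''):
    """Raise ValueError if any primary key field equals the null marker."""
    for name, value in zip(COLUMNS, row):
        if name in PRIMARY_KEY_COLUMNS and value == null_marker:
            raise ValueError(
                "Cannot insert null value for primary key column '%s'" % name)


# The data line from the reported CSV: the two trailing fields are empty.
row = next(csv.reader(io.StringIO('a,2015,b c d,e f g,,\n')))

try:
    check_primary_keys(row)
    error = None
except ValueError as exc:
    error = str(exc)
```

Here the two consecutive delimiters parse into empty strings, so track_number is treated as null and the row is rejected.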
However, because track_number is an integer, in this case we cannot import an
empty string even with a different null marker:
{code}
COPY tracks_by_album (album_title, album_year, performer, album_genre,
track_number, track_title) FROM '10854.csv' WITH HEADER = 'true' AND NULL =
'xxx';
[...]
Failed to import 1 rows: ValueError - invalid literal for int() with base 10:
'' - given up after 1 attempts
[...]
{code}
So the CSV is incorrect: it leaves the primary key column track_number empty.
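To see why the marker does not help, here is a minimal sketch of the conversion, assuming a hypothetical helper convert_int (not cqlsh's actual code): a field matching the null marker becomes null, and anything else is passed to int(). With NULL = 'xxx' the empty field no longer matches the marker, so it reaches int() and fails.

```python
def convert_int(value, null_marker):
    """Hypothetical converter: the null marker maps to None, the rest to int."""
    if value == null_marker:
        return None
    return int(value)


# The marker itself is read as null.
as_null = convert_int('xxx', 'xxx')

# An empty field is not the marker, so int('') is attempted and raises.
try:
    convert_int('', 'xxx')
    message = None
except ValueError as exc:
    message = str(exc)
```

Either way the row cannot be imported, because a null track_number is not a valid primary key value.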
As for the error messages, here is the [2.1
patch|https://github.com/stef1927/cassandra/commits/10854-2.1] and the [dtest
results|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10854-2.1-dtest/]
for this [modified
branch|https://github.com/stef1927/cassandra-dtest/commits/10854].
The patch applies to 2.2 and is fairly trivial, so I did not run CI on 2.2+.
> cqlsh COPY FROM csv having line with more than one consecutive ',' delimiter
> is throwing 'list index out of range'
> --------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-10854
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10854
> Project: Cassandra
> Issue Type: Bug
> Components: Tools
> Environment: cqlsh 5.0.1 | Cassandra 2.1.11.969 | DSE 4.8.3 | CQL
> spec 3.2.1
> Reporter: Puspendu Banerjee
> Assignee: Stefania
> Priority: Minor
>
> cqlsh COPY FROM csv having line with more than one consecutive ',' delimiter
> is throwing 'list index out of range'
> Steps to reproduce:
> {code}
> CREATE TABLE tracks_by_album (
> album_title TEXT,
> album_year INT,
> performer TEXT STATIC,
> album_genre TEXT STATIC,
> track_number INT,
> track_title TEXT,
> PRIMARY KEY ((album_title, album_year), track_number)
> );
> {code}
> Create a file, tracks_by_album.csv, containing the following 2 lines:
> {code}
> album,year,performer,genre,number,title
> a,2015,b c d,e f g,,
> {code}
> {code}
> cqlsh> COPY music.tracks_by_album
> (album_title, album_year, performer, album_genre, track_number,
> track_title)
> FROM '~/tracks_by_album.csv'
> WITH HEADER = 'true';
> Error :
> Starting copy of music.tracks_by_album with columns ['album_title',
> 'album_year', 'performer', 'album_genre', 'track_number', 'track_title'].
> list index out of range
> Aborting import at record #1. Previously inserted records are still present,
> and some records after that may be present as well.
> {code}