Jose Martinez Poblete created CASSANDRA-7407:
------------------------------------------------
Summary: COPY command does not work properly with collections
causing failure to import data
Key: CASSANDRA-7407
URL: https://issues.apache.org/jira/browse/CASSANDRA-7407
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: cqlsh 4.1.1,
Cassandra 2.0.7.31,
CQL spec 3.1.1,
Thrift protocol 19.39.0
Reporter: Jose Martinez Poblete
The COPY command does not properly format collections in the output CSV - to be
able to re-import the data.
Here is how you can replicate the problem:
{noformat}
CREATE TABLE user_colors (
user_id int PRIMARY KEY,
colors list<ascii>
);
UPDATE user_colors SET colors = ['red','blue'] WHERE user_id=5;
UPDATE user_colors SET colors = ['purple','yellow'] WHERE user_id=6;
UPDATE user_colors SET colors = ['black''] WHERE user_id=7;
COPY user_colors (user_id, colors) TO 'output.csv';
CREATE TABLE user_colors2 (
user_id int PRIMARY KEY,
colors list<ascii>
);
COPY user_colors2 (user_id, colors ) FROM 'user_colors.csv';
Bad Request: line 1:68 no viable alternative at input ']'
Aborting import at record #0 (line 1). Previously-inserted values still present.
0 rows imported in 0.007 seconds.
{noformat}
The CSV file seems to be malformed
- The single quotes within the collection are missing
- The double quotes for collection on user_id=7 are missing and causing COPY to
fail.
{noformat}
5,"[red, blue]"
7,[black]
6,"[purple, yellow]"
{noformat}
Should be like this
{noformat}
5,"['red', 'blue']"
7,"['black']"
6,"['purple', 'yellow']"
{noformat}
Once the file is changed, the import works
{noformat}
COPY user_colors2 (user_id, colors ) FROM 'user_colors.csv';
3 rows imported in 0.012 seconds.
SELECT * FROM user_colors2;
user_id | colors
---------+------------------
5 | [red, blue]
7 | [black]
6 | [purple, yellow]
(3 rows)
{noformat}
--
This message was sent by Atlassian JIRA
(v6.2#6252)