[
https://issues.apache.org/jira/browse/CASSANDRA-12174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15391869#comment-15391869
]
Hiroyuki Nishi commented on CASSANDRA-12174:
--------------------------------------------
Hi [~Stefania],
Thanks for your response.
I have updated the patch as follows:
https://github.com/yhnishi/cassandra/commit/db75d9dd0d74d3476d500f6b99c22e117dc73ec6
Below are sample results.
Success:
{code}
cqlsh> COPY test.airplanes (name, manufacturer, year, mach) FROM '/tmp/1.csv';
Using 7 child processes
Starting copy of test.airplanes with columns [name, manufacturer, year, mach].
Processed: 1 rows; Rate: 2 rows/s; Avg. rate: 2 rows/s
1 rows imported from 1 files in 0.420 seconds (0 skipped).
cqlsh> COPY test.airplanes (name, manufacturer, year, mach) FROM '/tmp/1.csv,/tmp/2.csv';
Using 7 child processes
Starting copy of test.airplanes with columns [name, manufacturer, year, mach].
Processed: 2 rows; Rate: 3 rows/s; Avg. rate: 5 rows/s
2 rows imported from 2 files in 0.418 seconds (0 skipped).
cqlsh> COPY test.airplanes (name, manufacturer, year, mach) FROM '/tmp/*.csv';
Using 7 child processes
Starting copy of test.airplanes with columns [name, manufacturer, year, mach].
Processed: 2 rows; Rate: 3 rows/s; Avg. rate: 5 rows/s
2 rows imported from 2 files in 0.413 seconds (0 skipped).
{code}
Error:
{code}
cqlsh> COPY test.airplanes (name, manufacturer, year, mach) FROM '/tmp/1234-doesnotexist';
Using 7 child processes
Starting copy of test.airplanes with columns [name, manufacturer, year, mach].
Failed to import 0 rows: IOError - Can't open '/tmp/1234-doesnotexist' for reading: file does not exist, given up after 1 attempts
Processed: 0 rows; Rate: 0 rows/s; Avg. rate: 0 rows/s
0 rows imported from 0 files in 0.218 seconds (0 skipped).
cqlsh> COPY test.airplanes (name, manufacturer, year, mach) FROM '/tmp/*-doesnotexist';
Using 7 child processes
Starting copy of test.airplanes with columns [name, manufacturer, year, mach].
Failed to import 0 rows: IOError - Can't open '/tmp/*-doesnotexist' for reading: file does not exist, given up after 1 attempts
Processed: 0 rows; Rate: 0 rows/s; Avg. rate: 0 rows/s
0 rows imported from 0 files in 0.218 seconds (0 skipped).
cqlsh> COPY test.airplanes (name, manufacturer, year, mach) FROM '/tmp/1234-doesnotexist,/tmp/1235-doesnotexist';
Using 7 child processes
Starting copy of test.airplanes with columns [name, manufacturer, year, mach].
Failed to import 0 rows: IOError - Can't open '/tmp/1234-doesnotexist' for reading: file does not exist, given up after 1 attempts
Processed: 0 rows; Rate: 0 rows/s; Avg. rate: 0 rows/s
0 rows imported from 0 files in 0.217 seconds (0 skipped).
cqlsh> COPY test.airplanes (name, manufacturer, year, mach) FROM '/tmp/1.csv,/tmp/*-doesnotexist';
Using 7 child processes
Starting copy of test.airplanes with columns [name, manufacturer, year, mach].
Failed to import 0 rows: IOError - Can't open '/tmp/*-doesnotexist' for reading: file does not exist, given up after 1 attempts
Processed: 0 rows; Rate: 0 rows/s; Avg. rate: 0 rows/s
0 rows imported from 1 files in 0.219 seconds (0 skipped).
{code}
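For reference, the path handling exercised above (comma-separated entries, each expanded as a glob, with an error for any entry that matches nothing) can be sketched roughly as below. This is only an illustration and not the code in the patch: resolve_sources and missing_file_message are hypothetical names, and unlike the output above, which reports only the first unmatched entry, the sketch collects all of them.

```python
import glob
import os


def resolve_sources(spec):
    """Split a comma-separated path spec and expand each entry as a glob.

    Returns (found, missing): the existing files, and the entries that
    matched no file at all. Hypothetical helper, not part of cqlsh.
    """
    found, missing = [], []
    for entry in spec.split(','):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        # glob.glob returns an empty list for a pattern or plain path
        # that does not exist, so both cases are handled the same way.
        matches = sorted(p for p in glob.glob(os.path.expanduser(entry))
                         if os.path.isfile(p))
        if matches:
            found.extend(matches)
        else:
            missing.append(entry)
    return found, missing


def missing_file_message(entry):
    """Build an error text modelled on the sample output above."""
    return "Can't open '{}' for reading: file does not exist".format(entry)
```

A caller could print missing_file_message(entry) for each item in the second list and continue importing from the first list, which matches the last sample where '/tmp/1.csv' is still counted as one file.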
Please check the patch once again.
> COPY FROM should raise error for non-existing input files
> ---------------------------------------------------------
>
> Key: CASSANDRA-12174
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12174
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Stefan Podkowinski
> Assignee: Hiroyuki Nishi
> Priority: Minor
> Labels: lhf
> Attachments: CASSANDRA-12174-trunk.patch
>
>
> Currently the CSV COPY FROM command does not raise an error for non-existing
> paths. Instead, only "0 rows imported" is shown as the result.
> As the COPY FROM command is often used in tutorials and getting-started
> guides, I'd suggest giving a clear error message when an input file is
> missing. Without such an error, it can be confusing for the user to see the
> command finish without any clue as to why no rows were imported.
> {noformat}
> CREATE KEYSPACE test
> WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 1 };
> USE test;
> CREATE TABLE airplanes (
> name text PRIMARY KEY,
> manufacturer ascii,
> year int,
> mach float
> );
> COPY airplanes (name, manufacturer, year, mach) FROM '/tmp/1234-doesnotexist';
> Using 3 child processes
> Starting copy of test.airplanes with columns [name, manufacturer, year, mach].
> Processed: 0 rows; Rate: 0 rows/s; Avg. rate: 0 rows/s
> 0 rows imported from 0 files in 0.216 seconds (0 skipped).
> {noformat}