Ruilong Huo created HAWQ-280:
--------------------------------
Summary: Error accessing external table or copying from file with
bad rows
Key: HAWQ-280
URL: https://issues.apache.org/jira/browse/HAWQ-280
Project: Apache HAWQ
Issue Type: Bug
Components: External Tables
Reporter: Ruilong Huo
Assignee: Lei Chang
Accessing an external table, or copying from a file, fails with an error and
returns no result when the data contains bad rows.
1. Error accessing external table with bad rows
```
Step 1: download the attached test.csv, which contains 2000 rows, all of them badly formatted
Step 2: start gpfdist service
gpfdist -d /home/gpadmin/data/ -p 8081 -l /home/gpadmin/log/load.log &
------------------------------------------------------------------------------------------------
[1] 34635
Serving HTTP on port 8081, directory /home/gpadmin/data
Step 3: create external table
CREATE EXTERNAL TABLE test_ext (id INT, a TEXT, b TEXT, c TEXT, z TEXT)
LOCATION ('gpfdist://localhost:8081/test.csv')
FORMAT 'CSV'
LOG ERRORS INTO test_ext_err SEGMENT REJECT LIMIT 3000 ROWS;
-----------------------------------------------------------------------------------------------------
NOTICE: Error table "test_ext_err" does not exist. Auto generating an error
table with the same name
CREATE EXTERNAL TABLE
Step 4: access external table
SELECT COUNT(*) FROM test_ext;
-------------------------------------------------
ERROR: All 1000 first rows in this segment were rejected. Aborting operation
regardless of REJECT LIMIT value. Last error was: missing data for column "z"
(seg0 localhost:40000 pid=35647)
DETAIL: External table test_ext, line 1000 of
gpfdist://localhost:8081/test.csv: "29,aaa,bbb,zzz"
```
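For anyone without the attachment, a stand-in input file can be generated. This is only a sketch: it assumes every row looks like the one quoted in the error detail ("29,aaa,bbb,zzz"), i.e. four comma-separated fields against the five-column table definition, and that the first field is simply the row number. The real test.csv may differ. The function names `bad_rows_csv` and `write_bad_rows` are made up for illustration.

```python
import csv
import io


def bad_rows_csv(n_rows=2000):
    """Return CSV text with n_rows rows of four fields each.

    Assumption: each row is one field short of the five columns
    (id, a, b, c, z), mirroring the row shown in the error detail.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for i in range(1, n_rows + 1):
        # Four values only, so the last column always has no data.
        writer.writerow([i, "aaa", "bbb", "zzz"])
    return buf.getvalue()


def write_bad_rows(path, n_rows=2000):
    """Write the generated rows to path (hypothetical helper)."""
    with open(path, "w", newline="") as f:
        f.write(bad_rows_csv(n_rows))
```

Calling `write_bad_rows("/home/gpadmin/data/test.csv")` would place the file in the directory served by gpfdist in Step 2.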
2. Error copying from file with bad rows
```
Step 1: download the attached test.csv, which contains 2000 rows, all of them badly formatted
Step 2: create table
CREATE TABLE test_copy (id INT, a TEXT, b TEXT, c TEXT, z TEXT);
------------------------------------------------------------------------------------------------
CREATE TABLE
Step 3: copy data from the file into the table
COPY test_copy FROM '/Users/intern/Downloads/test.csv' LOG ERRORS INTO
test_copy_err SEGMENT REJECT LIMIT 3000 ROWS;
--------------------------------------------------------------------------------------------------------
NOTICE: Error table "test_copy_err" does not exist. Auto generating an error
table with the same name
WARNING: The error table was created in the same transaction as this
operation. It will get dropped if transaction rolls back even if bad rows are
present
HINT: To avoid this create the error table ahead of time using: CREATE TABLE
<name> (cmdtime timestamp with time zone, relname text, filename text, linenum
integer, bytenum integer, errmsg text, rawdata text, rawbytes bytea)
ERROR: All 1000 first rows in this segment were rejected. Aborting operation
regardless of REJECT LIMIT value. Last error was: missing data for column "a"
CONTEXT: COPY test_copy, line 1000: "29,aaa,bbb,zzz"
```