Ruilong Huo created HAWQ-280:
--------------------------------

             Summary: Error accessing external table or copying from file with bad rows
                 Key: HAWQ-280
                 URL: https://issues.apache.org/jira/browse/HAWQ-280
             Project: Apache HAWQ
          Issue Type: Bug
          Components: External Tables
            Reporter: Ruilong Huo
            Assignee: Lei Chang


The operation errors out without returning a result when accessing an external 
table or copying from a file that contains bad rows.

1. Error accessing external table with bad rows
```
Step 1: download the attached test.csv, which contains 2000 rows, all badly formatted

Step 2: start gpfdist service
gpfdist -d /home/gpadmin/data/ -p 8081 -l /home/gpadmin/log/load.log &
------------------------------------------------------------------------------------------------
[1] 34635
Serving HTTP on port 8081, directory /home/gpadmin/data

Step 3: create external table
CREATE EXTERNAL TABLE test_ext (id INT, a TEXT, b TEXT, c TEXT, z TEXT)
LOCATION ('gpfdist://localhost:8081/test.csv')
FORMAT 'CSV'
LOG ERRORS INTO test_ext_err SEGMENT REJECT LIMIT 3000 ROWS;
-----------------------------------------------------------------------------------------------------
NOTICE:  Error table "test_ext_err" does not exist. Auto generating an error table with the same name
CREATE EXTERNAL TABLE

Step 4: access external table
SELECT COUNT(*) FROM test_ext;
-------------------------------------------------
ERROR:  All 1000 first rows in this segment were rejected. Aborting operation regardless of REJECT LIMIT value. Last error was: missing data for column "z"  (seg0 localhost:40000 pid=35647)
DETAIL:  External table test_ext, line 1000 of gpfdist://localhost:8081/test.csv: "29,aaa,bbb,zzz"
```

2. Error copying from file with bad rows
```
Step 1: download the attached test.csv, which contains 2000 rows, all badly formatted

Step 2: create table
CREATE TABLE test_copy (id INT, a TEXT, b TEXT, c TEXT, z TEXT);
------------------------------------------------------------------------------------------------
CREATE TABLE

Step 3: copy data from the file into the table
COPY test_copy FROM '/Users/intern/Downloads/test.csv' LOG ERRORS INTO 
test_copy_err SEGMENT REJECT LIMIT 3000 ROWS;
--------------------------------------------------------------------------------------------------------
NOTICE:  Error table "test_copy_err" does not exist. Auto generating an error table with the same name
WARNING:  The error table was created in the same transaction as this operation. It will get dropped if transaction rolls back even if bad rows are present
HINT:  To avoid this create the error table ahead of time using: CREATE TABLE <name> (cmdtime timestamp with time zone, relname text, filename text, linenum integer, bytenum integer, errmsg text, rawdata text, rawbytes bytea)
ERROR:  All 1000 first rows in this segment were rejected. Aborting operation regardless of REJECT LIMIT value. Last error was: missing data for column "a"
CONTEXT:  COPY test_copy, line 1000: "29,aaa,bbb,zzz"
```
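
Both reproductions depend on the attached test.csv. If the attachment is not at hand, a file of similar shape can probably be approximated with the sketch below. It assumes every row looks like the "29,aaa,bbb,zzz" sample quoted in the error output, that is, four comma-separated fields where the tables define five columns. The real id values and field contents of the attachment are not known, and the helper names make_bad_rows and count_short_rows are hypothetical.

```python
import csv
import io

EXPECTED_COLUMNS = 5  # id, a, b, c, z in the table definitions above


def make_bad_rows(count=2000):
    """Return `count` CSV lines that each carry four fields instead of five."""
    # Assumption: rows resemble the "29,aaa,bbb,zzz" sample from the error
    # output; the ids in the real attachment are unknown.
    return [f"{i},aaa,bbb,zzz" for i in range(1, count + 1)]


def count_short_rows(text, expected=EXPECTED_COLUMNS):
    """Count rows that have fewer fields than the table has columns."""
    reader = csv.reader(io.StringIO(text))
    return sum(1 for fields in reader if len(fields) < expected)


if __name__ == "__main__":
    rows = make_bad_rows()
    # Write the stand-in file into the current directory.
    with open("test.csv", "w", newline="") as handle:
        handle.write("\n".join(rows) + "\n")
    print(count_short_rows("\n".join(rows)), "of", len(rows), "rows are short")
```

Placing the generated file in the directory served by gpfdist (or at the path given to COPY) should allow the steps above to be retried; whether it triggers exactly the same messages depends on how closely it matches the original attachment.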



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
