Chunling Wang created HAWQ-1149:
-----------------------------------
Summary: Built-in function gp_persistent_build_all loses data in
gp_relfile_node and gp_persistent_relfile_node
Key: HAWQ-1149
URL: https://issues.apache.org/jira/browse/HAWQ-1149
Project: Apache HAWQ
Issue Type: Bug
Components: Core
Reporter: Chunling Wang
Assignee: Lei Chang
When we create a new table, and insert data into it. There will be records in
gp_relfile_node, gp_persistent_relfile_node and gp_persistent_relation_node.
But if we run the HAWQ build-in function gp_persistent_build_all, we will find
that the record in gp_relfile_node and gp_persistent_relfile_node for this
table is lost. And if there are more than 1 file in this talbe, we will get
error when we drop this table. Here are the steps to recur this bug:
1. Create table a, and insert data into a with two concurrent process:
{code}
postgres=# create table a(id int);
CREATE TABLE
postgres=# insert into a select generate_series(1, 10000000);
INSERT 0 10000000
{code}
{code}
postgres=# insert into a select generate_series(10000000, 20000000);
INSERT 0 10000001
{code}
2. Check the persistent table and find two files in this table's directory:
{code}
postgres=# select oid from pg_class where relname='a';
oid
---------
3017232
(1 row)
postgres=# select * from gp_relfile_node where relfilenode_oid=3017232;
relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num
-----------------+------------------+----------------+-----------------------
3017232 | 1 | (4,128) | 855050
3017232 | 2 | (4,129) | 855051
(2 rows)
postgres=# select * from gp_persistent_relation_node where
relfilenode_oid=3017232;
tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved
| parent_xid | persistent_serial_num | previous_free_tid
----------------+--------------+-----------------+------------------+----------+------------+-----------------------+-------------------
16385 | 16387 | 3017232 | 2 | 0
| 0 | 158943 | (0,0)
(1 row)
postgres=# select * from gp_persistent_relfile_node where
relfilenode_oid=3017232;
tablespace_oid | database_oid | relfilenode_oid | segment_file_num |
relation_storage_manager | persistent_state | relation_bufpool_kind |
parent_xid | persistent_serial_num | previous_free_tid
----------------+--------------+-----------------+------------------+--------------------------+------------------+-----------------------+------------+-----------------------+-------------------
16385 | 16387 | 3017232 | 1 |
2 | 2 | 0 | 0 |
855050 | (0,0)
16385 | 16387 | 3017232 | 2 |
2 | 2 | 0 | 0 |
855051 | (0,0)
(2 rows)
hadoop fs -ls /hawq_default/16385/16387/3017232
-rw------- 3 wangchunling supergroup 100103584 2016-11-08 17:02
/hawq_default/16385/16387/3017232/1
-rw------- 3 wangchunling supergroup 100103600 2016-11-08 17:02
/hawq_default/16385/16387/3017232/2
{code}
3. Rebuilt persistent tables.
{code}
postgres=# insert into a select generate_series(10000000, 20000000);
INSERT 0 10000001
postgres=# select gp_persistent_reset_all();
gp_persistent_reset_all
-------------------------
1
(1 row)
postgres=# select gp_persistent_build_all(false);
gp_persistent_build_all
-------------------------
1
(1 row)
{code}
4. Check persistent table and find data lost in gp_relfile_node and
gp_persistent_relfile_node.
{code}
postgres=# select * from gp_relfile_node where relfilenode_oid=3017232;
relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num
-----------------+------------------+----------------+-----------------------
(0 rows)
postgres=# select * from gp_persistent_relation_node where
relfilenode_oid=3017232;
tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved
| parent_xid | persistent_serial_num | previous_free_tid
----------------+--------------+-----------------+------------------+----------+------------+-----------------------+-------------------
16385 | 16387 | 3017232 | 2 | 0
| 0 | 159020 | (0,0)
(1 row)
postgres=# select * from gp_persistent_relfile_node where
relfilenode_oid=3017232;
tablespace_oid | database_oid | relfilenode_oid | segment_file_num |
relation_storage_manager | persistent_state | relation_bufpool_kind |
parent_xid | persistent_serial_num | previous_free_tid
----------------+--------------+-----------------+------------------+--------------------------+------------------+-----------------------+------------+-----------------------+-------------------
(0 rows)
{code}
5. Drop talbe a and get error.
{code}
postgres=# DROP TABLE a;
ERROR: TID for persistent 'Relation Directory: '16385/16387/3017232'' tuple is
invalid (0,0) (index 2, transaction kind 'Commit') (persistentendxactrec.c:264)
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)