[jira] [Created] (HAWQ-1552) HAWQ does not support HDFS storage policy?
lynn created HAWQ-1552:
---------------------------

             Summary: HAWQ does not support HDFS storage policy?
                 Key: HAWQ-1552
                 URL: https://issues.apache.org/jira/browse/HAWQ-1552
             Project: Apache HAWQ
          Issue Type: Bug
          Components: libhdfs
            Reporter: lynn
            Assignee: Radar Lei

1. Set the ALL_SSD storage policy on the /ssd path on HDFS:
hdfs storagepolicies -setStoragePolicy -path /ssd -policy ALL_SSD

2. Check the policy:
[hdfs@master1 ~]$ hdfs storagepolicies -getStoragePolicy -path /ssd
The storage policy of /ssd:
BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], replicationFallbacks=[DISK]}

3. Put a file onto HDFS:
hdfs dfs -put dd.txt /ssd/fs_ssd

4. Check the block locations:
[hdfs@master1 ~]$ hdfs fsck /ssd/fs_ssd/dd.txt -blocks -locations -files
decommissioned replica(s) and 0 decommissioning replica(s).
0. BP-845848702-192.168.1.130-1496396138316:blk_1075677761_7587369 len=7 repl=3 [DatanodeInfoWithStorage[192.168.1.133:50010,DS-1510d4e4-cfdb-4184-8f47-7417b91f4f5c,SSD], DatanodeInfoWithStorage[192.168.1.132:50010,DS-7d498d01-8242-4621-8901-fe397a8196c3,SSD], DatanodeInfoWithStorage[192.168.1.134:50010,DS-37c4e804-1b2a-4156-a54c-cecc8393bb09,SSD]]

5. In HAWQ, create filespace fs_ssd and tablespace ts_ssd, with fs_ssd pointing to the /ssd/fs_ssd path (a sketch of the SQL is shown after the steps).

6. In psql, create a table:
create table p(i int) with(appendonly=true, orientation=parquet, compresstype=snappy) tablespace fs_ssd;

7. In psql, insert data:
insert into p values(1);

8. Locate the table's file on HDFS:
select c.relname, d.dat2tablespace tablespace_id, d.oid database_id, c.relfilenode table_id
from pg_database d, pg_class c, pg_namespace n
where c.relnamespace = n.oid and d.datname = current_database() and c.relname = 'p';

 relname | tablespace_id | database_id | table_id
---------+---------------+-------------+----------
 p       | 1021474       | 1021475     | 1037187

9. Check the locations of table "p"'s file:
[hdfs@master1 ~]$ hdfs fsck /ssd/fs_ssd/1021474/1021475/1037187/1 -blocks -locations -files
Connecting to namenode via http://master1.bigdata:50070/fsck?ugi=hdfs=1=1=1=%2Fssd%2Ffs_ssd%2F1021474%2F1021475%2F1037187%2F1
FSCK started by hdfs (auth:SIMPLE) from /192.168.1.130 for path /ssd/fs_ssd/1021474/1021475/1037187/1 at Fri Nov 17 17:26:17 CST 2017
/ssd/fs_ssd/1021474/1021475/1037187/1 188 bytes, 1 block(s): OK
0. BP-845848702-192.168.1.130-1496396138316:blk_1075677763_7587371 len=188 repl=3 [DatanodeInfoWithStorage[192.168.1.134:50010,DS-4be28698-6ebd-4ae0-a515-f3fb5e1293ab,DISK], DatanodeInfoWithStorage[192.168.1.133:50010,DS-99d56cac-5af0-483d-b93f-a1bbae038934,DISK], DatanodeInfoWithStorage[192.168.1.132:50010,DS-22c09ee4-49ac-47ed-a592-4f0e84776086,DISK]]

The blocks of the file written by HAWQ are on DISK, not SSD: the ALL_SSD storage policy doesn't work for data written through HAWQ.
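Step 5 is only described in prose. As a reference point, here is a minimal sketch of the tablespace side in SQL, assuming the filespace fs_ssd has already been registered against /ssd/fs_ssd (for example with the hawq filespace utility) and using the ts_ssd name from step 5 (the report itself writes "tablespace fs_ssd" in step 6):

-- Hypothetical reconstruction of steps 5-6: bind a tablespace to the SSD-backed filespace
CREATE TABLESPACE ts_ssd FILESPACE fs_ssd;

-- Place the test table on that tablespace
CREATE TABLE p (i int)
WITH (appendonly=true, orientation=parquet, compresstype=snappy)
TABLESPACE ts_ssd;

Whichever exact names are used, the observed result is the same: the file HAWQ writes under /ssd/fs_ssd ends up on DISK storage, while a plain hdfs dfs -put into the same directory honors the ALL_SSD policy.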
[jira] [Created] (HAWQ-1546) Data files on HDFS are too large after HAWQ loads data
lynn created HAWQ-1546:
---------------------------

             Summary: Data files on HDFS are too large after HAWQ loads data
                 Key: HAWQ-1546
                 URL: https://issues.apache.org/jira/browse/HAWQ-1546
             Project: Apache HAWQ
          Issue Type: Bug
          Components: libhdfs
            Reporter: lynn
            Assignee: Radar Lei

create table person_l1 (id int, name varchar(20), age int, sex char(1)) with(appendonly=true, orientation=parquet, compresstype=snappy);
create table person_l2 (id int, name varchar(20), age int, sex char(1)) with(appendonly=true, orientation=parquet, compresstype=snappy);

Run the insert statement 480 times:
sh insert.sh

One execution of the insert statement loads 1920 rows:
psql -d test -f i1.sql

Check the table sizes on disk:
select sotdtablename, sotdsize
from hawq_toolkit.hawq_size_of_table_disk
where sotdtablename like 'person_%';
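insert.sh and i1.sql are referenced but not attached to the report. A hypothetical stand-in for i1.sql, assuming it loads one batch of 1920 rows per run (and that generate_series is available); repeating it 480 times via insert.sh reproduces the many small parquet batches whose on-disk size the hawq_toolkit query above measures:

-- hypothetical i1.sql: load 1920 rows into person_l1 in a single statement
INSERT INTO person_l1 (id, name, age, sex)
SELECT g,
       'name_' || g,
       (g % 80) + 1,
       CASE WHEN g % 2 = 0 THEN 'M' ELSE 'F' END
FROM generate_series(1, 1920) AS g;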
[jira] [Created] (HAWQ-1513) "string_agg" function does not support query optimization on partitioned tables
lynn created HAWQ-1513:
---------------------------

             Summary: "string_agg" function does not support query optimization on partitioned tables
                 Key: HAWQ-1513
                 URL: https://issues.apache.org/jira/browse/HAWQ-1513
             Project: Apache HAWQ
          Issue Type: Test
          Components: Catalog
            Reporter: lynn
            Assignee: Radar Lei

SELECT
    mid,
    COUNT(*),
    string_agg(create_time || '#' || s_id ORDER BY create_time)
FROM t1
WHERE t1.create_time BETWEEN to_timestamp('2016-12-19 00:20:00:770', 'yyyy-MM-dd HH24:MI:ss.ff')
                         AND to_timestamp('2016-12-19 23:40:00:770', 'yyyy-MM-dd HH24:MI:ss.ff')
GROUP BY mid;

When we EXPLAIN this SQL statement, the query plan scans all partitions of table t1, which has a huge impact on query performance. What can I do to solve this problem?
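The report states that the plan scans every partition but does not include the plans themselves. One way to narrow the problem down, assuming t1 is partitioned on create_time, is to EXPLAIN the query with and without the string_agg call and compare which partitions each plan scans:

-- Baseline: same predicate without string_agg; check whether partition elimination happens here
EXPLAIN
SELECT mid, COUNT(*)
FROM t1
WHERE create_time BETWEEN to_timestamp('2016-12-19 00:20:00:770', 'yyyy-MM-dd HH24:MI:ss.ff')
                      AND to_timestamp('2016-12-19 23:40:00:770', 'yyyy-MM-dd HH24:MI:ss.ff')
GROUP BY mid;

-- Reported case: adding string_agg back is what reportedly makes the plan scan all partitions of t1
EXPLAIN
SELECT mid, COUNT(*),
       string_agg(create_time || '#' || s_id ORDER BY create_time)
FROM t1
WHERE create_time BETWEEN to_timestamp('2016-12-19 00:20:00:770', 'yyyy-MM-dd HH24:MI:ss.ff')
                      AND to_timestamp('2016-12-19 23:40:00:770', 'yyyy-MM-dd HH24:MI:ss.ff')
GROUP BY mid;

If only the second plan touches every partition, the problem is specific to the ordered string_agg aggregate; if both do, the predicate itself is what prevents partition elimination.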
[jira] [Created] (HAWQ-1437) HAWQ errors when Kerberos is enabled on Hadoop
lynn created HAWQ-1437:
---------------------------

             Summary: HAWQ errors when Kerberos is enabled on Hadoop
                 Key: HAWQ-1437
                 URL: https://issues.apache.org/jira/browse/HAWQ-1437
             Project: Apache HAWQ
          Issue Type: Wish
          Components: PXF
            Reporter: lynn
            Assignee: Ed Espino
             Fix For: 2.2.0.0-incubating

When Kerberos is enabled on Hadoop, creating a HAWQ external table over HDFS needs principal information. What should I do to create it successfully?

postgres=# CREATE EXTERNAL TABLE pxf_hdfs_textsimple(location text, month text, num_orders int, total_sales float8)
postgres-# LOCATION ('pxf:///mycluster/data/pxf_examples/pxf_hdfs_simple.txt?PROFILE=HdfsTextSimple')
postgres-# FORMAT 'TEXT' (delimiter=E',');
ERROR:  Invalid URI pxf:///mycluster/data/pxf_examples/pxf_hdfs_simple.txt?PROFILE=HdfsTextSimple : missing authority section
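The "missing authority section" error is, by itself, independent of Kerberos: in pxf:///mycluster/... the host (authority) part of the URI is empty, so "mycluster" is parsed as the start of the path. A sketch of the same DDL with an authority filled in, assuming "mycluster" is the HDFS HA nameservice for this cluster (a namenode host:port would take the same position):

CREATE EXTERNAL TABLE pxf_hdfs_textsimple (location text, month text, num_orders int, total_sales float8)
LOCATION ('pxf://mycluster/data/pxf_examples/pxf_hdfs_simple.txt?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (delimiter=E',');

The Kerberos principal and keytab are typically supplied through the HAWQ and PXF configuration rather than in the LOCATION URI, so the DDL itself should not need principal information once the URI is valid.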