[jira] [Updated] (HAWQ-1339) Cache lookup failed after explain OLAP grouping query
[ https://issues.apache.org/jira/browse/HAWQ-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amy updated HAWQ-1339:
----------------------
    Fix Version/s:     (was: 2.2.0.0-incubating)
                       backlog

> Cache lookup failed after explain OLAP grouping query
> -----------------------------------------------------
>
>                 Key: HAWQ-1339
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1339
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Amy
>            Assignee: Amy
>             Fix For: backlog
>
>         Attachments: olap_setup.sql
>
>
> Some OLAP grouping queries may error out with "division by zero", and running EXPLAIN on such a query produces the notice "cache lookup failed for attribute 7 of relation 75036 (lsyscache.c:437)".
> {code}
> postgres=# SELECT sale.vn,sale.cn,sale.dt,GROUPING(sale.vn),
> postgres-#        TO_CHAR(COALESCE(MAX(DISTINCT floor(sale.vn+sale.qty)),0),'.999'),
> postgres-#        TO_CHAR(COALESCE(VAR_SAMP(floor(sale.pn/sale.prc)),0),'.999'),
> postgres-#        TO_CHAR(COALESCE(COUNT(floor(sale.qty+sale.prc)),0),'.999')
> postgres-# FROM sale,customer,vendor
> postgres-# WHERE sale.cn=customer.cn AND sale.vn=vendor.vn
> postgres-# GROUP BY ROLLUP((sale.prc),(sale.vn,sale.vn),(sale.pn,sale.pn),(sale.dt),(sale.qty,sale.vn,sale.qty)),
> postgres-#          ROLLUP((sale.pn),(sale.vn,sale.pn),(sale.qty)),(),sale.cn
> postgres-# HAVING COALESCE(VAR_POP(sale.cn),0) >= 45.5839785564113;
> ERROR:  division by zero (seg0 localhost:4 pid=25205)
> postgres=#
> postgres=# explain SELECT sale.vn,sale.cn,sale.dt,GROUPING(sale.vn),
>            TO_CHAR(COALESCE(MAX(DISTINCT floor(sale.vn+sale.qty)),0),'.999'),
>            TO_CHAR(COALESCE(VAR_SAMP(floor(sale.pn/sale.prc)),0),'.999'),
>            TO_CHAR(COALESCE(COUNT(floor(sale.qty+sale.prc)),0),'.999')
>            FROM sale,customer,vendor
>            WHERE sale.cn=customer.cn AND sale.vn=vendor.vn
>            GROUP BY ROLLUP((sale.prc),(sale.vn,sale.vn),(sale.pn,sale.pn),(sale.dt),(sale.qty,sale.vn,sale.qty)),
>                     ROLLUP((sale.pn),(sale.vn,sale.pn),(sale.qty)),(),sale.cn
>            HAVING COALESCE(VAR_POP(sale.cn),0) >= 45.5839785564113;
> NOTICE:  cache lookup failed for attribute 7 of relation 75036 (lsyscache.c:437)
> {code}
> The reproduction steps are:
> {code}
> Step 1: Prepare schema and data using the attached olap_setup.sql
> Step 2: Run the OLAP grouping query below
> -- OLAP query involving MAX() function
> SELECT sale.vn,sale.cn,sale.dt,GROUPING(sale.vn),
>        TO_CHAR(COALESCE(MAX(DISTINCT floor(sale.vn+sale.qty)),0),'.999'),
>        TO_CHAR(COALESCE(VAR_SAMP(floor(sale.pn/sale.prc)),0),'.999'),
>        TO_CHAR(COALESCE(COUNT(floor(sale.qty+sale.prc)),0),'.999')
> FROM sale,customer,vendor
> WHERE sale.cn=customer.cn AND sale.vn=vendor.vn
> GROUP BY ROLLUP((sale.prc),(sale.vn,sale.vn),(sale.pn,sale.pn),(sale.dt),(sale.qty,sale.vn,sale.qty)),
>          ROLLUP((sale.pn),(sale.vn,sale.pn),(sale.qty)),(),sale.cn
> HAVING COALESCE(VAR_POP(sale.cn),0) >= 45.5839785564113;
> explain SELECT sale.vn,sale.cn,sale.dt,GROUPING(sale.vn),
>         TO_CHAR(COALESCE(MAX(DISTINCT floor(sale.vn+sale.qty)),0),'.999'),
>         TO_CHAR(COALESCE(VAR_SAMP(floor(sale.pn/sale.prc)),0),'.999'),
>         TO_CHAR(COALESCE(COUNT(floor(sale.qty+sale.prc)),0),'.999')
> FROM sale,customer,vendor
> WHERE sale.cn=customer.cn AND sale.vn=vendor.vn
> GROUP BY ROLLUP((sale.prc),(sale.vn,sale.vn),(sale.pn,sale.pn),(sale.dt),(sale.qty,sale.vn,sale.qty)),
>          ROLLUP((sale.pn),(sale.vn,sale.pn),(sale.qty)),(),sale.cn
> HAVING COALESCE(VAR_POP(sale.cn),0) >= 45.5839785564113;
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
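Not part of the report, but a minimal sketch of why the query errors out before EXPLAIN ever matters: when any row has sale.prc = 0, the aggregate input floor(sale.pn/sale.prc) raises "division by zero" before VAR_SAMP sees it, and guarding the divisor (the SQL NULLIF(prc, 0) idiom) sidesteps it. Modeled here in Python with hypothetical (pn, prc) rows:

```python
import math

# Hypothetical sample rows: (pn, prc) pairs; a zero price triggers the error.
rows = [(100, 2.0), (200, 4.0), (300, 0.0)]

def floor_div(pn, prc):
    """Mirrors floor(sale.pn / sale.prc): raises when prc == 0."""
    return math.floor(pn / prc)

def floor_div_guarded(pn, prc):
    """Mirrors floor(sale.pn / NULLIF(sale.prc, 0)): yields NULL (None) instead."""
    return None if prc == 0 else math.floor(pn / prc)

# The unguarded expression fails on the zero-price row, as in the bug report.
try:
    [floor_div(pn, prc) for pn, prc in rows]
except ZeroDivisionError:
    print("division by zero")

# The guarded expression skips the bad row; SQL aggregates ignore NULL inputs.
values = [v for pn, prc in rows if (v := floor_div_guarded(pn, prc)) is not None]
print(values)  # [50, 50]
```

The guard only avoids the runtime error; the spurious "cache lookup failed" NOTICE from EXPLAIN is a separate catalog-lookup problem.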
[jira] [Updated] (HAWQ-1342) QE process hang in shared input scan on segment node
[ https://issues.apache.org/jira/browse/HAWQ-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming LI updated HAWQ-1342:
--------------------------
    Affects Version/s: 2.0.0.0-incubating

> QE process hang in shared input scan on segment node
> ----------------------------------------------------
>
>                 Key: HAWQ-1342
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1342
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Query Execution
>    Affects Versions: 2.0.0.0-incubating
>            Reporter: Amy
>            Assignee: Amy
>             Fix For: backlog
>
>
> QE processes hang on one segment node while the QD and the QEs on the other segment nodes have terminated.
> {code}
> [gpadmin@test1 ~]$ cat hostfile
> test1 master secondary namenode
> test2 segment datanode
> test3 segment datanode
> test4 segment datanode
> test5 segment namenode
> [gpadmin@test3 ~]$ ps -ef | grep postgres | grep -v grep
> gpadmin  41877      1  0 05:35 ?  00:01:04 /usr/local/hawq_2_1_0_0/bin/postgres -D /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-Multinode-parallel/product/segmentdd -i -M segment -p 20100 --silent-mode=true
> gpadmin  41878  41877  0 05:35 ?  00:00:02 postgres: port 20100, logger process
> gpadmin  41881  41877  0 05:35 ?  00:00:00 postgres: port 20100, stats collector process
> gpadmin  41882  41877  0 05:35 ?  00:00:07 postgres: port 20100, writer process
> gpadmin  41883  41877  0 05:35 ?  00:00:01 postgres: port 20100, checkpoint process
> gpadmin  41884  41877  0 05:35 ?  00:00:11 postgres: port 20100, segment resource manager
> gpadmin  42108  41877  0 05:35 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(65193) con35 seg0 cmd2 slice9 MPPEXEC SELECT
> gpadmin  42416  41877  0 05:35 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(65359) con53 seg0 cmd2 slice11 MPPEXEC SELECT
> gpadmin  44807  41877  0 05:36 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(2272) con183 seg0 cmd2 slice31 MPPEXEC SELECT
> gpadmin  44819  41877  0 05:36 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(2278) con183 seg0 cmd2 slice10 MPPEXEC SELECT
> gpadmin  44821  41877  0 05:36 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(2279) con183 seg0 cmd2 slice25 MPPEXEC SELECT
> gpadmin  45447  41877  0 05:36 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(2605) con207 seg0 cmd2 slice9 MPPEXEC SELECT
> gpadmin  49859  41877  0 05:38 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(4805) con432 seg0 cmd2 slice20 MPPEXEC SELECT
> gpadmin  49881  41877  0 05:38 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(4816) con432 seg0 cmd2 slice7 MPPEXEC SELECT
> gpadmin  51937  41877  0 05:39 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(5877) con517 seg0 cmd2 slice7 MPPEXEC SELECT
> gpadmin  51939  41877  0 05:39 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(5878) con517 seg0 cmd2 slice9 MPPEXEC SELECT
> gpadmin  51941  41877  0 05:39 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(5879) con517 seg0 cmd2 slice11 MPPEXEC SELECT
> gpadmin  51943  41877  0 05:39 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(5880) con517 seg0 cmd2 slice13 MPPEXEC SELECT
> gpadmin  51953  41877  0 05:39 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(5885) con517 seg0 cmd2 slice26 MPPEXEC SELECT
> gpadmin  53436  41877  0 05:40 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(6634) con602 seg0 cmd2 slice15 MPPEXEC SELECT
> gpadmin  57095  41877  0 05:41 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(8450) con782 seg0 cmd2 slice10 MPPEXEC SELECT
> gpadmin  57097  41877  0 05:41 ?  00:00:04 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(8451) con782 seg0 cmd2 slice11 MPPEXEC SELECT
> gpadmin  63159  41877  0 05:43 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(11474) con1082 seg0 cmd2 slice15 MPPEXEC SELECT
> gpadmin  64018  41877  0 05:44 ?  00:00:03 postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(11905) con1121 seg0 cmd2 slice5 MPPEXEC SELECT
> {code}
> The stack info is as below; it seems the QE hangs in a shared input scan.
> {code}
> [gpadmin@test3 ~]$ gdb -p 42108
> (gdb) info threads
>   2 Thread 0x7f4f6b335700 (LWP 42109)  0x0032214df283 in poll () from /lib64/libc.so.6
> * 1 Thread 0x7f4f9041c920 (LWP 42108)  0x0032214e1523 in select
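One way to triage a hang like this (not from the report; a sketch over hypothetical excerpts of the `ps` output shown above) is to group the surviving QE processes by their conNNN connection id. A connection that still has slices on only one segment host, long after its QD exited, matches the orphaned-QE pattern described here:

```python
import re
from collections import defaultdict

# Hypothetical excerpts of `ps -ef` output on a segment host (same format as above).
ps_lines = [
    "postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(65193) con35 seg0 cmd2 slice9 MPPEXEC SELECT",
    "postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(65359) con53 seg0 cmd2 slice11 MPPEXEC SELECT",
    "postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(2272) con183 seg0 cmd2 slice31 MPPEXEC SELECT",
    "postgres: port 20100, hawqsuperuser olap_group 10.32.35.192(2278) con183 seg0 cmd2 slice10 MPPEXEC SELECT",
]

# Group slices by connection id; sessions with many leftover slices stand out.
slices_by_con = defaultdict(list)
for line in ps_lines:
    m = re.search(r"\b(con\d+)\b.*\b(slice\d+)\b", line)
    if m:
        slices_by_con[m.group(1)].append(m.group(2))

print(dict(slices_by_con))
# {'con35': ['slice9'], 'con53': ['slice11'], 'con183': ['slice31', 'slice10']}
```

Cross-checking the resulting connection ids against the sessions still known on the master then tells you which QEs were abandoned and should be inspected with gdb, as done above.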
[jira] [Updated] (HAWQ-1345) Cannot connect to PSQL: FATAL: could not count blocks of relation 1663/16508/1249: Not a directory
[ https://issues.apache.org/jira/browse/HAWQ-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming LI updated HAWQ-1345:
--------------------------
    Affects Version/s: 2.0.0.0-incubating

> Cannot connect to PSQL: FATAL: could not count blocks of relation 1663/16508/1249: Not a directory
> --------------------------------------------------------------------------------------------------
>
>                 Key: HAWQ-1345
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1345
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: 2.0.0.0-incubating
>            Reporter: Amy
>            Assignee: Ming LI
>             Fix For: backlog
>
>
> Unable to connect via psql to the current database.
> We can access psql for the template1 database, but for the current database we get the following error:
> {code}
> #psql
> psql: FATAL: could not count blocks of relation 1663/16508/1249: Not a directory
> {code}
> When failing over to the Standby and starting the HAWQ Master, we get the same error again:
> {code}
> 2017-02-17 02:12:50.119207 PST,,,p22482,th-16818971840,,,seg-1,"DEBUG1","0","opening ""pg_xlog/00010005001D"" for reading (log 5, seg 29)",,,0,,"xlog.c",3162,
> 2017-02-17 02:12:50.176450 PST,,,p22482,th-16818971840,,,seg-1,"FATAL","42809","could not count blocks of relation 1663/16508/1249: Not a directory","xlog redo insert: rel 1663/16508/1249; tid 32682/85
> REDO PASS 3 @ 5/7669B838; LSN 5/7669E480: prev 5/76694C98; xid 825193; bkpb1: Heap - insert: rel 1663/16508/1249; tid 32682/85",,0,,"smgr.c",1146,"
> Stack trace:
> 1    0x8c5628 postgres errstart + 0x288
> 2    0x7ddfbc postgres smgrnblocks + 0x3c
> 3    0x4fbdf8 postgres XLogReadBuffer + 0x18
> 4    0x4ea2c9 postgres + 0x4ea2c9
> 5    0x4eaf47 postgres + 0x4eaf47
> 6    0x4f8af3 postgres StartupXLOG_Pass3 + 0x153
> 7    0x4fb277 postgres StartupProcessMain + 0x187
> 8    0x557cd8 postgres AuxiliaryProcessMain + 0x478
> 9    0x793c40 postgres + 0x793c40
> 10   0x798901 postgres + 0x798901
> 11   0x79a8c9 postgres PostmasterMain + 0x759
> 12   0x4a4039 postgres main + 0x519
> 13   0x7f3b979e1d5d libc.so.6 __libc_start_main + 0xfd
> 14   0x4a40b9 postgres + 0x4a40b9
> "
> {code}
> On both Master and Standby, we can see that the pg_attribute file for the current database, 1663/16508/1249, has reached 1 GB in size:
> {code}
> [gpadmin@master]$ pwd
> /data/hawq/master
> [gpadmin@master master]$ cd base
> [gpadmin@master base]$ ls
> 1  16386  16387  16508
> [gpadmin@master base]$ cd 16508
> [gpadmin@master 16508]$ ls -thrl 1249
> -rw------- 1 gpadmin gpadmin 1.0G Feb 16 18:24 1249
> {code}
> From strace we were able to find the following:
> {code}
> [gpadmin@master master]$ strace /usr/local/hawq/bin/postgres --single -P -O -p 5432 -D $MASTER_DATA_DIRECTORY -c gp_session_role=utility currentdatabase <<EOF
> select version();
> EOF
> (...)
> open("base/16508/pg_internal.init", O_RDONLY) = -1 ENOENT (No such file or directory)
> open("base/16508/1259", O_RDWR) = 6
> lseek(6, 0, SEEK_END) = 188645376
> lseek(6, 0, SEEK_SET) = 0
> read(6, "\0\0\0\0\340\5\327\1\1\0\1\0\f\3@\3\0\200\4\2008\263P\1`\262\252\1\270\261P\1"..., 32768) = 32768
> open("base/16508/1249", O_RDWR) = 8
> lseek(8, 0, SEEK_END) = 1073741824
> open("base/16508/1249/1", O_RDWR) = -1 ENOTDIR (Not a directory)
> open("base/16508/1249/1", O_RDWR|O_CREAT, 0600) = -1 ENOTDIR (Not a directory)
> futex(0x7ff80e53f620, FUTEX_WAKE_PRIVATE, 2147483647) = 0
> futex(0x7ff80e756af0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
> open("/usr/share/locale/locale.alias", O_RDONLY) = 10
> fstat(10, {st_mode=S_IFREG|0644, st_size=2512, ...}) = 0
> {code}
> We see HAWQ is treating pg_attribute as a directory while it is a file.
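Useful context not in the report: Postgres-lineage storage splits a relation into 1 GB segment files. The first segment is the bare relfilenode ("1249"), and with the standard 8 KB block size, block N lives in segment N // 131072, normally named with a dotted suffix ("1249.1"). A sketch of that mapping, under those assumed sizes, which is why a second segment is needed the moment the file hits exactly 1073741824 bytes; the strace above instead shows HAWQ opening "1249/1", a path *under* the file, hence ENOTDIR:

```python
BLCKSZ = 8192                            # assumed 8 KB block size
RELSEG_SIZE = 1 << 30                    # assumed 1 GB segment size, in bytes
BLOCKS_PER_SEG = RELSEG_SIZE // BLCKSZ   # 131072 blocks per segment file

def segment_path(relfilenode: str, blocknum: int) -> str:
    """Map a block number to its segment file, using the usual dotted naming."""
    segno = blocknum // BLOCKS_PER_SEG
    return relfilenode if segno == 0 else f"{relfilenode}.{segno}"

print(segment_path("1249", 0))        # '1249'    (within the first 1 GB)
print(segment_path("1249", 131071))   # '1249'    (last block of segment 0)
print(segment_path("1249", 131072))   # '1249.1'  (first block past 1 GB)
```

The function and names here are illustrative, not HAWQ's actual code path; they only show the expected file naming that the observed "1249/1" open deviates from.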
[jira] [Created] (HAWQ-1346) If using WebHdfsFileSystem as default FileSystem, it will cause a class cast exception
Tian Hong Wang created HAWQ-1346:
------------------------------------

             Summary: If using WebHdfsFileSystem as default FileSystem, it will cause a class cast exception
                 Key: HAWQ-1346
                 URL: https://issues.apache.org/jira/browse/HAWQ-1346
             Project: Apache HAWQ
          Issue Type: Bug
          Components: PXF
            Reporter: Tian Hong Wang
            Assignee: Ed Espino

In incubator-hawq/pxf/pxf-hdfs/src/main/java/org/apache/hawq/pxf/plugins/hdfs/ChunkRecordReader.java:

{code}
private DFSInputStream getInputStream() {
    return (DFSInputStream) (fileIn.getWrappedStream());
}
{code}

If WebHdfsFileSystem is used as the default FileSystem, this cast throws a ClassCastException:

{code}
java.lang.ClassCastException: org.apache.hadoop.hdfs.web.WebHdfsFileSystem$WebHdfsInputStream cannot be cast to org.apache.hadoop.hdfs.DFSInputStream
	at org.apache.hawq.pxf.plugins.hdfs.ChunkRecordReader.getInputStream(ChunkRecordReader.java:76)
	at org.apache.hawq.pxf.plugins.hdfs.ChunkRecordReader.<init>(ChunkRecordReader.java:112)
	at org.apache.hawq.pxf.plugins.hdfs.LineBreakAccessor.getReader(LineBreakAccessor.java:64)
	at org.apache.hawq.pxf.plugins.hdfs.HdfsSplittableDataAccessor.getNextSplit(HdfsSplittableDataAccessor.java:114)
	at org.apache.hawq.pxf.plugins.hdfs.HdfsSplittableDataAccessor.openForRead(HdfsSplittableDataAccessor.java:83)
	at org.apache.hawq.pxf.service.ReadBridge.beginIteration(ReadBridge.java:73)
	at org.apache.hawq.pxf.service.rest.BridgeResource$1.write(BridgeResource.java:132)
	at com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider.writeTo(StreamingOutputProvider.java:71)
	at com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider.writeTo(StreamingOutputProvider.java:57)
	at com.sun.jersey.spi.container.ContainerResponse.write(ContainerResponse.java:306)
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1437)
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
	at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
{code}
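The unchecked downcast in getInputStream() is what blows up. One defensive pattern (sketched here in Python for brevity; the actual fix would live in the Java reader, and the class names below are stand-ins that only mirror the report) is to verify the wrapped stream's type first and fail with an actionable message instead of an opaque ClassCastException:

```python
class DFSInputStream:        # stand-in mirroring org.apache.hadoop.hdfs.DFSInputStream
    pass

class WebHdfsInputStream:    # stand-in mirroring WebHdfsFileSystem$WebHdfsInputStream
    pass

def get_dfs_input_stream(wrapped):
    """Return the wrapped stream only if it really is a DFSInputStream."""
    if not isinstance(wrapped, DFSInputStream):
        raise TypeError(
            f"ChunkRecordReader needs a DFSInputStream, got {type(wrapped).__name__}; "
            "WebHDFS is not supported as the default filesystem here"
        )
    return wrapped

stream = get_dfs_input_stream(DFSInputStream())   # accepted
try:
    get_dfs_input_stream(WebHdfsInputStream())    # rejected with a clear message
except TypeError as e:
    print(e)
```

In Java the same shape is an `instanceof` check before the cast (or falling back to the generic FSDataInputStream API when the DFS-specific stream is unavailable).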
[jira] [Assigned] (HAWQ-1345) Cannot connect to PSQL: FATAL: could not count blocks of relation 1663/16508/1249: Not a directory
[ https://issues.apache.org/jira/browse/HAWQ-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming LI reassigned HAWQ-1345:
-----------------------------
    Assignee: Ming LI  (was: Amy)

> Cannot connect to PSQL: FATAL: could not count blocks of relation 1663/16508/1249: Not a directory
> --------------------------------------------------------------------------------------------------
>
>                 Key: HAWQ-1345
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1345
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Amy
>            Assignee: Ming LI
>             Fix For: 3.0.0.0
[jira] [Created] (HAWQ-1345) Cannot connect to PSQL: FATAL: could not count blocks of relation 1663/16508/1249: Not a directory
Amy created HAWQ-1345:
--------------------------

             Summary: Cannot connect to PSQL: FATAL: could not count blocks of relation 1663/16508/1249: Not a directory
                 Key: HAWQ-1345
                 URL: https://issues.apache.org/jira/browse/HAWQ-1345
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Catalog
            Reporter: Amy
            Assignee: Ed Espino
             Fix For: 3.0.0.0
[GitHub] incubator-hawq issue #1119: HAWQ-1328. Add deny and exclude policy template ...
Github user zhangh43 commented on the issue:

    https://github.com/apache/incubator-hawq/pull/1119

    Thanks @denalex, the quote typo is fixed. According to the Ranger documentation, Ranger first traverses all the deny conditions to determine whether to disallow the access, and only then traverses all the positive (allow) conditions.
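The deny-before-allow evaluation order described in the comment can be sketched as a tiny policy evaluator (a hypothetical model for illustration, not Ranger's actual code): every deny condition is checked first, and only when none matches are the positive conditions consulted.

```python
def is_access_allowed(user, resource, deny_rules, allow_rules):
    """Deny conditions are evaluated first; any match disallows access outright."""
    if any(rule(user, resource) for rule in deny_rules):
        return False
    # Only then are the positive (allow) conditions traversed.
    return any(rule(user, resource) for rule in allow_rules)

# Hypothetical rules: one denied user, one publicly readable resource.
deny_rules = [lambda u, r: u == "blocked_user"]
allow_rules = [lambda u, r: r == "public_table"]

print(is_access_allowed("alice", "public_table", deny_rules, allow_rules))         # True
print(is_access_allowed("blocked_user", "public_table", deny_rules, allow_rules))  # False
```

A consequence of this ordering is that a matching deny condition wins even when an allow condition would also match, which is the behavior the deny policy template in the PR relies on.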