[
https://issues.apache.org/jira/browse/HAWQ-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15356543#comment-15356543
]
Lili Ma commented on HAWQ-882:
------------------------------
[[email protected]], this looks like an HDFS bug: it is neither specific to
parquet versus row-oriented tables, nor to the compression type (gzip, snappy,
or quicklz). Since HAWQ parquet tables are stored on HDFS, HAWQ writes the
data to HDFS when calling writeFooter, and the call stack shows the error is
thrown while the NameNode is choosing a block to write to.
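As a quick way to confirm that the compression setting is not the variable, a
minimal reproduction sketch with psycopg2 (the driver the reporter used) could
look like the following. The host, port, database, and user here are
assumptions and need to be adapted to the cluster:

    # repro_hawq_882.py -- hedged sketch; connection parameters are assumptions
    import psycopg2

    TABLES = {
        "testzip":  "with (appendonly=true, orientation=parquet, compresstype=gzip, compresslevel=8)",
        "testzip1": "with (appendonly=true, orientation=parquet, compresstype=snappy)",
        "testzip2": "with (appendonly=true, compresstype=quicklz, compresslevel=1)",
    }

    conn = psycopg2.connect(host="kmaster", port=5432, dbname="postgres", user="gpadmin")
    conn.autocommit = True
    cur = conn.cursor()

    for name, opts in TABLES.items():
        cur.execute("drop table if exists %s" % name)
        cur.execute("create table %s(a int, b varchar(20)) %s" % (name, opts))
        try:
            # single-row inserts succeed; a 10000-row batch in one statement fails
            cur.execute(
                "insert into %s select i, 'x' from generate_series(1, 10000) i" % name
            )
            print(name, "OK")
        except psycopg2.Error as e:
            print(name, "FAILED:", e)

If all three tables fail the same way, compresstype can be ruled out. And
because the stack trace ends in BlockManager.chooseTarget4NewBlock on the
NameNode, a plain HDFS write that bypasses HAWQ entirely should reproduce the
same failure if HDFS itself is unhealthy. A sketch using the hdfs (hdfscli)
Python package, assuming WebHDFS is enabled on the NameNode's default port:

    # hdfs_probe.py -- hedged sketch; URL and user are assumptions
    from hdfs import InsecureClient

    client = InsecureClient("http://kmaster:50070", user="hdfs")
    # Writing ~1 MB forces block allocation through the NameNode; if the
    # "could only be replicated to 0 nodes" error appears here too, the
    # problem is in HDFS (full or unhealthy datanodes), not in HAWQ.
    client.write("/tmp/hawq882_probe", data=b"x" * (1 << 20), overwrite=True)

Checking datanode health and free space with hdfs dfsadmin -report is also
worth doing, since "replicated to 0 nodes instead of minReplication (=1)" with
no excluded nodes usually means no datanode could accept a new block.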
> insert failed when compresstype=gzip
> ------------------------------------
>
> Key: HAWQ-882
> URL: https://issues.apache.org/jira/browse/HAWQ-882
> Project: Apache HAWQ
> Issue Type: Bug
> Reporter: liuguo
> Assignee: Lei Chang
>
> The create SQL is:
> create table testzip(a int, b varchar(20)) with (appendonly=true,
> orientation=parquet, compresstype=gzip, compresslevel=8);
> Inserting one record at a time works fine, but inserting 10000 records
> at once fails with the following error:
> psycopg2.OperationalError: Parquet Storage Write error on segment file
> 'hdfs://kmaster:8020/hawq_default/16385/16535/16569/1': writing footer file
> failure (seg0 kslave1:40000 pid=790454)
> ----------------------------------------------------------------------------------------------------------------
> The create SQL is: create table testzip1(a int, b varchar(20))
> with (appendonly=true, orientation=parquet, compresstype=snappy);
> When the table's compresstype=snappy, I get this error:
> ERROR: Parquet Storage Write error on segment file
> 'hdfs://kmaster:8020/hawq_default/16385/16535/16564/1': writing footer file
> failure (seg0 kslave2:40000 pid=441973)
> ----------------------------------------------------------------------------------------------------------------
> The create SQL is: create table testzip2(a int, b varchar(20))
> with (appendonly=true, compresstype=quicklz, compresslevel=1);
> When the table's compresstype=quicklz, I get this error:
> ERROR: Could not flush (fsync) Append-Only segment file
> 'hdfs://kmaster:8020/hawq_default/16385/16535/16556/1' to disk for relation
> 'testquicklz': Input/output error (seg0 kslave1:40000 pid=4112)
> DETAIL: File /hawq_default/16385/16535/16556/1 could only be replicated
> to 0 nodes instead of minReplication (=1). There are 2 datanode(s)
> running and no node(s) are excluded in this operation.
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1559)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3245)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:663)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:975)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2036)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
> My Hadoop cluster has 2 nodes, version 2.7.1.2.4.
> My HAWQ cluster has 2 nodes, version 2.0.0.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)