[jira] [Closed] (HAWQ-1324) Query cancel cause segment to go into Crash recovery

2017-11-02 Thread Radar Lei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radar Lei closed HAWQ-1324.
---
Resolution: Fixed

> Query cancel cause segment to go into Crash recovery
> 
>
> Key: HAWQ-1324
> URL: https://issues.apache.org/jira/browse/HAWQ-1324
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Query Execution
> Affects Versions: 2.0.0.0-incubating
> Reporter: Ming LI
> Assignee: Ming LI
> Priority: Major
>
> A query was cancelled due to a connection issue with HDFS on Isilon. Seg26 
> then went into crash recovery because an INSERT query was being cancelled. 
> What should the expected behaviour be when HDFS becomes unavailable and a 
> query fails as a result?
> A core file was generated at the time of the crash recovery.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (HAWQ-1324) Query cancel cause segment to go into Crash recovery

2017-03-05 Thread Ruilong Huo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruilong Huo closed HAWQ-1324.
-

> Query cancel cause segment to go into Crash recovery
> 
>
> Key: HAWQ-1324
> URL: https://issues.apache.org/jira/browse/HAWQ-1324
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Query Execution
> Affects Versions: 2.0.0.0-incubating
> Reporter: Ming LI
> Assignee: Ming LI
> Fix For: 2.1.0.0-incubating
>
>
> A query was cancelled due to a connection issue with HDFS on Isilon. Seg26 
> then went into crash recovery because an INSERT query was being cancelled. 
> What should the expected behaviour be when HDFS becomes unavailable and a 
> query fails as a result?
> Below is the HDFS error:
> {code}
> 2017-01-04 03:04:08.382615 
> JST,"carund","dwhrun",p574246,th1862944896,"192.168.10.12","47554",2017-01-04 
> 03:03:08 JST,0,con198952,,seg29,"FATAL","08006","connection to client 
> lost",,,0,,"postgres.c",3518,
> 2017-01-04 03:04:08.420099 
> JST,,,p755778,th18629448960,,,seg-1,"LOG","0","3rd party 
> error log:
> 2017-01-04 03:04:08.419969, p574222, th140507423066240, ERROR Handle 
> Exception: NamenodeImpl.cpp: 670: Unexpected error: status: 
> STATUS_FILE_NOT_AVAILABLE = 0xC467 Path: 
> hawq_default/16385/16563/802748/26 with path=
> ""/hawq_default/16385/16563/802748/26"", 
> clientname=libhdfs3_client_random_866998528_count_1_pid_574222_tid_140507423066240
> @ Hdfs::Internal::UnWrapper<Hdfs::HdfsIOException, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, 
> Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, 
> Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, 
> Hdfs::Internal::Nothing>::unwrap(char const*, int)
> @ Hdfs::Internal::UnWrapper<Hdfs::UnresolvedLinkException, Hdfs::HdfsIOException, 
> Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, 
> Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, Hdfs::Internal::Nothing, 
> Hdfs::Internal::Nothing, Hdfs::Internal::Nothing>::unwrap(char const*, int)
> @ Hdfs::Internal::NamenodeImpl::fsync(std::string const&, std::string const&)
> @ Hdfs::Internal::NamenodeProxy::fsync(std::string const&, std::string const&)
> @ Hdfs::Internal::OutputStreamImpl::closePipeline()
> @ Hdfs::Internal::OutputStreamImpl::close()
> @ hdfsCloseFile
> @ gpfs_hdfs_closefile
> @ HdfsCloseFile
> @ HdfsFileClose
> @ CleanupTempFiles
> @ AbortTransaction
> @ AbortCurrentTransaction
> @ PostgresMain
> @ BackendStartup
> @ ServerLoop
> @ PostmasterMain
> @ main
> @ Unknown
> @ Unknown""SysLoggerMain","syslogger.c",518,
> 2017-01-04 03:04:08.420272 
> JST,"carund","dwhrun",p574222,th1862944896,"192.168.10.12","47550",2017-01-04 
> 03:03:08 
> JST,40678725,con198952,cmd4,seg25,,,x40678725,sx1,"WARNING","58030","could 
> not close file 7 : 
> (hdfs://ffdlakehd.ffwin.fujifilm.co.jp:8020/hawq_default/16385/16563/802748/26) errno 
> 5","Unexpected error: status: STATUS_FILE_NOT_AVAILABLE = 0xC467 Path: 
> hawq_default/16385/16563/802748/26 with path=
> ""/hawq_default/16385/16563/802748/26"", 
> clientname=libhdfs3_client_random_866998528_count_1_pid_574222_tid_140507423066240",,0,,"fd.c",2762,
> {code}
> Segment 26 going into Crash recovery - from seg26 log file
> {code}
> 2017-01-04 03:04:08.420314 
> JST,"carund","dwhrun",p574222,th1862944896,"192.168.10.12","47550",2017-01-04 
> 03:03:08 
> JST,40678725,con198952,cmd4,seg25,,,x40678725,sx1,"LOG","08006","could not 
> send data to client: Connection reset by peer",,,0,,"pqcomm.c",1292,
> 2017-01-04 03:04:08.420358 
> JST,"carund","dwhrun",p574222,th1862944896,"192.168.10.12","47550",2017-01-04 
> 03:03:08 JST,0,con198952,,seg25,"LOG","08006","could not send data to 
> client: Broken pipe",,,0,,"pqcomm.c",1292,
> 2017-01-04 03:04:08.420375 
> JST,"carund","dwhrun",p574222,th1862944896,"192.168.10.12","47550",2017-01-04 
> 03:03:08 JST,0,con198952,,seg25,"FATAL","08006","connection to client 
> lost",,,0,,"postgres.c",3518,
> 2017-01-04 03:04:08.950354 
> JST,,,p755773,th18629448960,,,seg-1,"LOG","0","server process 
> (PID 574240) was terminated by signal 11: Segmentation 
> fault",,,0,,"postmaster.c",4748,
> 2017-01-04 03:04:08.950403 
> JST,,,p755773,th18629448960,,,seg-1,"LOG","0","terminating 
> any other active server processes",,,0,,"postmaster.c",4486,
> 2017-01-04 03:04:08.954044 
> JST,,,p41605,th18629448960,,,seg-1,"LOG","0","Segment RM 
> exits.",,,0,,"resourcemanager.c",340,
> 2017-01-04 03:04:08.954078 
> JST,,,p41605,th18629448960,,,seg-1,"LOG","0","Clean up 
> handler in message server is