[
https://issues.apache.org/jira/browse/HAWQ-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086730#comment-15086730
]
zharui commented on HAWQ-323:
-----------------------------
Just now I found many core dump files in the segmentdd directory. I guess
that a core dump file is generated each time a heartbeat is sent. I ran gdb
on one of the core dump files and found that the segmentation fault occurs
in the sendIMAlive function. The call stack is as follows:
#0 0x00000000008d3c61 in sendIMAlive ()
#1 0x0000000000903e20 in MainHandlerLoop_RMSEG ()
#2 0x000000000090412d in ResManagerMainSegment2ndPhase ()
#3 0x00000000009090de in ResManagerMain ()
#4 0x0000000000909511 in ResManagerProcessStartup ()
#5 0x00000000007654d8 in CommenceNormalOperations ()
#6 0x0000000000766284 in do_reaper ()
#7 0x000000000076b2be in ServerLoop ()
#8 0x000000000076ca36 in PostmasterMain ()
#9 0x00000000006c5dda in main ()
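
Since there are many core dump files, one way to check whether they all crash
in the same place is to save the gdb backtrace of each one and tally the
innermost frame. The following is only a minimal sketch: the helper names
(parse_backtrace, crash_site, tally_crash_sites) are hypothetical, and it
assumes the backtraces have the symbol-only format shown above.

```python
import re
from collections import Counter

# Matches gdb frames printed without debug info, e.g.
# "#0 0x00000000008d3c61 in sendIMAlive ()"
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-fA-F]+)\s+in\s+(\S+)\s*\(")


def parse_backtrace(text):
    """Return a list of (frame_number, address, function) tuples."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            number = int(match.group(1))
            address = int(match.group(2), 16)
            frames.append((number, address, match.group(3)))
    return frames


def crash_site(frames):
    """Return the function name of the innermost frame (#0), or None."""
    for number, _address, function in frames:
        if number == 0:
            return function
    return None


def tally_crash_sites(backtraces):
    """Count how often each function appears as frame #0.

    `backtraces` is an iterable of backtrace texts, one per core dump.
    """
    return Counter(crash_site(parse_backtrace(text)) for text in backtraces)


# Two frames copied from the backtrace above, used as sample input.
SAMPLE = """\
#0 0x00000000008d3c61 in sendIMAlive ()
#1 0x0000000000903e20 in MainHandlerLoop_RMSEG ()
"""
```

If every core dump reports sendIMAlive as frame #0, that would be consistent
with the guess that each heartbeat attempt triggers the crash.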
> Cannot query when cluster include more than 1 segment
> -----------------------------------------------------
>
> Key: HAWQ-323
> URL: https://issues.apache.org/jira/browse/HAWQ-323
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Core, Resource Manager
> Affects Versions: 2.0.0-beta-incubating
> Reporter: zharui
> Assignee: Lei Chang
>
> The version I use is 2.0.0-beta-RC2. I can query data normally when the
> cluster has just 1 segment. Once the cluster has more than 1 segment online,
> I cannot finish any query and am informed that "ERROR: failed to acquire
> resource from resource manager, 7 of 8 segments are unavailable
> (pquery.c:788)".
> I have read the segment logs and the resource manager source code. I
> guess this issue is caused by a communication failure between the segment
> instances and the resource manager server. On the segment that connects to
> the resource manager successfully, I can find logs such as "AsyncComm
> framework receives message 518 from FD5" and "Resource enforcer increases
> memory quota to: total memory quota=65536 MB, delta memory quota = 65536 MB",
> but the other online segments have no such log entries.
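
The log comparison described in the issue could be automated roughly as
follows. This is a hedged sketch, not part of HAWQ: the function name and the
segment names in the usage example are hypothetical, and it assumes that the
fixed prefixes of the two quoted log messages are enough to identify a
segment that reached the resource manager (the message number and FD may
differ per segment).

```python
# Fixed prefixes of the two log messages quoted in the issue description.
MARKERS = (
    "AsyncComm framework receives message",
    "Resource enforcer increases memory quota",
)


def segments_without_markers(logs, markers=MARKERS):
    """Return the sorted names of segments whose log lacks any marker.

    `logs` maps a segment name to the full text of that segment's log.
    """
    return sorted(
        name
        for name, text in logs.items()
        if not all(marker in text for marker in markers)
    )
```

For example, given a dict such as {"seg1": <log text>, "seg2": <log text>}
built by reading each segment's log file, the returned list names the
segments that never logged a successful exchange with the resource manager.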