[
https://issues.apache.org/jira/browse/GEODE-8436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Blake Bender reopened GEODE-8436:
---------------------------------
[~alberto.bustamante.reyes] this change is causing a failure in the test
`testThinClientPoolExecuteHAFunction` on Red Hat (both RHEL7 and RHEL8 fail).
Per our policy I've reverted the change while we investigate. If you have
access to a RHEL machine, you're welcome to try to track things down. I will
investigate here as time permits. What I see consistently in our output logs
is this:
{quote}[error 2020/08/20 17:09:00.313552 UTC
heavy-lifter-ae0a174c-1be5-522e-8b3f-b521b672e4d1:104377 139991808305216]
Execute: An exception (org.apache.geode.cache.execute.FunctionException:
org.apache.geode.internal.cache.execute.InternalFunctionInvocationTargetException:
memberDeparted event for <
heavy-lifter-ae0a174c-1be5-522e-8b3f-b521b672e4d1(GFECS24955:104622)<ec><v1>:41001
> crashed, false
at
org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResultInternal(PRFunctionStreamingResultCollector.java:115)
at
org.apache.geode.internal.cache.execute.ResultCollectorHolder.getResult(ResultCollectorHolder.java:53)
at
org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResult(PRFunctionStreamingResultCollector.java:88)
at
org.apache.geode.internal.cache.tier.sockets.command.ExecuteRegionFunction66.executeFunctionWithResult(ExecuteRegionFunction66.java:406)
at
org.apache.geode.internal.cache.tier.sockets.command.ExecuteRegionFunction66.cmdExecute(ExecuteRegionFunction66.java:201)
at
org.apache.geode.internal.cache.tier.sockets.BaseCommand.execute(BaseCommand.java:183)
at
org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMessage(ServerConnection.java:848)
at
org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:72)
at
org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1212)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$initializeServerConnectionThreadPool$3(AcceptorImpl.java:676)
at
org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
at java.lang.Thread.run(Thread.java:748)
Caused by:
org.apache.geode.internal.cache.execute.InternalFunctionInvocationTargetException:
memberDeparted event for <
heavy-lifter-ae0a174c-1be5-522e-8b3f-b521b672e4d1(GFECS24955:104622)<ec><v1>:41001
> crashed, false
at
org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.memberDeparted(PRFunctionStreamingResultCollector.java:375)
at
org.apache.geode.distributed.internal.ClusterDistributionManager$MemberDepartedEvent.handleEvent(ClusterDistributionManager.java:2502)
at
org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEvent.handleEvent(ClusterDistributionManager.java:2432)
at
org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEvent.handleEvent(ClusterDistributionManager.java:2421)
at
org.apache.geode.distributed.internal.ClusterDistributionManager.handleMemberEvent(ClusterDistributionManager.java:1401)
at
org.apache.geode.distributed.internal.ClusterDistributionManager.access$200(ClusterDistributionManager.java:108)
at
org.apache.geode.distributed.internal.ClusterDistributionManager$MemberEventInvoker.run(ClusterDistributionManager.java:1433)
... 1 more
) happened at remote server.
[info 2020/08/20 17:09:00.314091 UTC
heavy-lifter-ae0a174c-1be5-522e-8b3f-b521b672e4d1:104377 139991808305216] Close
connection message failed with msg: TcrConnection::send: connection failure
[info 2020/08/20 17:09:00.314303 UTC
heavy-lifter-ae0a174c-1be5-522e-8b3f-b521b672e4d1:104377 139991808305216]
Removing bucketServerLocation
[heavy-lifter-ae0a174c-1be5-522e-8b3f-b521b672e4d1.c.gemfire-dev.internal:24955]--1-0-0
due to GF_IOERR
0% tests passed, 1 tests failed out of 1{quote}
> Several threads calling PdxInstanceFactory::create() causes seg fault
> ---------------------------------------------------------------------
>
> Key: GEODE-8436
> URL: https://issues.apache.org/jira/browse/GEODE-8436
> Project: Geode
> Issue Type: Bug
> Components: native client
> Reporter: Alberto Bustamante Reyes
> Assignee: Alberto Bustamante Reyes
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.14.0
>
> Attachments: main.cpp
>
>
> I have seen a problem when "PdxInstanceFactory::create()" is called by
> several threads that are registering the same new PDX type.
> The core dump is produced here:
> {code}
> void PdxInstanceImpl::toDataMutable(PdxWriter& writer) {
> auto pt = getPdxType();
> std::vector<std::shared_ptr<PdxFieldType>>* pdxFieldList =
> pt->getPdxFieldTypes();
> {code}
> The problem is that "getPdxType()" returns nullptr, so the next line causes
> a segmentation fault when it calls "pt->getPdxFieldTypes()".
> The issue can be reproduced by running the attached client with 8 threads.
> This is the stack trace obtained in gdb:
> {code}
> #0 apache::geode::client::PdxType::getPdxFieldTypes (this=0x0) at
> /home/alb3rtobr/CLionProjects/Nordix/geode-native/cppcache/src/PdxType.hpp:178
> #1 0x00007f43dc4651b7 in
> apache::geode::client::PdxInstanceImpl::toDataMutable (this=0x7f43c0001600,
> writer=...) at
> /home/alb3rtobr/CLionProjects/Nordix/geode-native/cppcache/src/PdxInstanceImpl.cpp:1336
> #2 0x00007f43dc4650fd in apache::geode::client::PdxInstanceImpl::toData
> (this=0x7f43c0001600, writer=...) at
> /home/alb3rtobr/CLionProjects/Nordix/geode-native/cppcache/src/PdxInstanceImpl.cpp:1327
> #3 0x00007f43dc444971 in apache::geode::client::PdxHelper::serializePdx
> (output=..., pdxObject=warning: RTTI symbol not found for class
> 'std::_Sp_counted_ptr_inplace<apache::geode::client::PdxInstanceImpl,
> std::allocator<apache::geode::client::PdxInstanceImpl>,
> (__gnu_cxx::_Lock_policy)2>'
> warning: RTTI symbol not found for class
> 'std::_Sp_counted_ptr_inplace<apache::geode::client::PdxInstanceImpl,
> std::allocator<apache::geode::client::PdxInstanceImpl>,
> (__gnu_cxx::_Lock_policy)2>'
> std::shared_ptr<apache::geode::client::PdxSerializable> (use count 3, weak
> count 0) = {...})
> at
> /home/alb3rtobr/CLionProjects/Nordix/geode-native/cppcache/src/PdxHelper.cpp:77
> #4 0x00007f43dc44b4bc in apache::geode::client::PdxInstanceFactory::create
> (this=0x7f43c7ffecc8) at
> /home/alb3rtobr/CLionProjects/Nordix/geode-native/cppcache/src/PdxInstanceFactory.cpp:53
> #5 0x000000000040de2f in doPut () at
> /home/alb3rtobr/CLionProjects/dummy-client/main.cpp:60
> #6 0x0000000000427767 in std::__invoke_impl<void, void (*)()>
> (__f=@0x2561aa8: 0x40d860 <doPut()>) at
> /usr/bin/../lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/invoke.h:60
> #7 0x00000000004276fd in std::__invoke<void (*)()> (__fn=@0x2561aa8:
> 0x40d860 <doPut()>) at
> /usr/bin/../lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/invoke.h:95
> #8 0x00000000004276d5 in std::thread::_Invoker<std::tuple<void (*)()>
> >::_M_invoke<0ul> (this=0x2561aa8) at
> /usr/bin/../lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/thread:234
> #9 0x00000000004276a5 in std::thread::_Invoker<std::tuple<void (*)()>
> >::operator() (this=0x2561aa8) at
> /usr/bin/../lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/thread:243
> #10 0x0000000000427589 in
> std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> >
> >::_M_run (this=0x2561aa0)
> {code}
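
The failure mode described in the issue (several threads registering the same new type, one of them getting back a null type and dereferencing it) can be illustrated with a minimal, self-contained sketch. This is not geode-native code and not the fix from the pull request: `FakeType`, `FakeTypeRegistry`, `fieldCount` and `countNullLookups` are hypothetical names invented for the illustration. It assumes the lookup and the insertion are done under a single lock, and that the consumer checks the pointer before using it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a PDX type descriptor; not the geode-native class.
struct FakeType {
  std::string name;
  std::vector<std::string> fields;
};

// Hypothetical registry: lookup and insertion happen under one lock, so a
// caller never observes a missing (null) entry for a type that another
// thread is in the middle of registering.
class FakeTypeRegistry {
 public:
  std::shared_ptr<FakeType> getOrRegister(const std::string& name) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = types_.find(name);
    if (it != types_.end()) {
      return it->second;
    }
    auto created = std::make_shared<FakeType>(FakeType{name, {"id", "value"}});
    types_.emplace(name, created);
    return created;
  }

 private:
  std::mutex mutex_;
  std::map<std::string, std::shared_ptr<FakeType>> types_;
};

// Checks the pointer before dereferencing it instead of assuming non-null.
std::size_t fieldCount(const std::shared_ptr<FakeType>& type) {
  if (!type) {
    return 0;
  }
  return type->fields.size();
}

// Starts threadCount threads that all register the same type name and
// returns how many of them received a null type.
int countNullLookups(int threadCount) {
  FakeTypeRegistry registry;
  std::vector<int> sawNull(threadCount, 0);
  std::vector<std::thread> workers;
  for (int i = 0; i < threadCount; ++i) {
    workers.emplace_back([&registry, &sawNull, i] {
      auto type = registry.getOrRegister("SameNewType");
      sawNull[i] = (type == nullptr) ? 1 : 0;
      (void)fieldCount(type);  // safe even if type were null
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  int total = 0;
  for (int flag : sawNull) {
    total += flag;
  }
  return total;
}
```

Calling `countNullLookups(8)` mirrors the 8-thread reproduction mentioned above. In this sketch no thread should see a null type, because the find-or-create step is atomic with respect to other callers.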