[
https://issues.apache.org/jira/browse/SOLR-13276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16798857#comment-16798857
]
Cao Manh Dat edited comment on SOLR-13276 at 4/4/19 9:44 AM:
-------------------------------------------------------------
Thanks Hoss. I tried to reproduce the failure, but even on a Windows machine it
is hard to reproduce.
It seems that even SolrCloudTest sees the same failure (log attached), so the
failure does not appear to have been introduced by the changes made in this issue.
From the attached log, I suspect the cause is that IndexFetcher is kicked off
while CoreContainer is shutting down, so the core cannot be released.
from: {{thetaphi_Lucene-Solr-8.x-Windows_69.log.txt}}
Two nodes are shutting down
{code}
[junit4] 2> 126085 INFO (jetty-closer-1876-thread-3) [ ] o.a.s.c.CoreContainer Shutting down CoreContainer instance=1698101756
[junit4] 2> 126085 INFO (jetty-closer-1876-thread-3) [ ] o.a.s.c.ZkController Remove node as live in ZooKeeper:/live_nodes/127.0.0.1:61571_solr
[junit4] 2> 126085 INFO (jetty-closer-1876-thread-2) [ ] o.a.s.c.CoreContainer Shutting down CoreContainer instance=1055741610
[junit4] 2> 126085 INFO (jetty-closer-1876-thread-2) [ ] o.a.s.c.ZkController Remove node as live in ZooKeeper:/live_nodes/127.0.0.1:61566_solr
{code}
After that, indexFetcher failed to close a core, which led to the leak error.
{code}
[junit4] 2> 151088 ERROR (indexFetcher-1096-thread-1) [ ] o.a.s.c.CachingDirectoryFactory Error closing directory:org.apache.solr.common.SolrException: Timeout waiting for all directory ref counts to be released - gave up waiting on CachedDir<<refCount=1;path=C:\Users\jenkins\workspace\Lucene-Solr-8.x-Windows\solr\build\solr-solrj\test\J0\temp\solr.client.solrj.impl.CloudHttp2SolrClientTest_6DA5B1A938CC311D-001\tempDir-006\node1\.\replicaTypesTestColl_shard2_replica_p10\data\index;done=true>>
{code}
Therefore I think that SOLR-13339 may be able to solve this failure.
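To illustrate the suspected sequence, here is a minimal sketch of a reference-counted directory whose close waits, with a timeout, for all references to be released. The class {{RefCountedDir}} and its methods are hypothetical stand-ins written for this illustration; they are not Solr's CachingDirectoryFactory or IndexFetcher code, and the real timeout and bookkeeping may differ.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a ref-counted cached directory; NOT Solr's actual class. */
public class RefCountedDir {
    private final String path;
    private int refCount = 0;

    public RefCountedDir(String path) {
        this.path = path;
    }

    /** A user of the directory (for example a fetcher thread) takes a reference. */
    public synchronized void incRef() {
        refCount++;
    }

    /** Releases one reference and wakes up any thread waiting in closeAndWait. */
    public synchronized void decRef() {
        if (refCount <= 0) {
            throw new IllegalStateException("refCount already 0 for " + path);
        }
        refCount--;
        notifyAll();
    }

    public synchronized int getRefCount() {
        return refCount;
    }

    /**
     * Waits up to timeoutMs for every reference to be released.
     * Returns true on a clean close, false if the wait timed out (a "leak").
     */
    public synchronized boolean closeAndWait(long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (refCount > 0) {
            long remainingNs = deadline - System.nanoTime();
            if (remainingNs <= 0) {
                return false; // gave up waiting: a reference is still held
            }
            // Sleep until notified by decRef or until the remaining time elapses.
            wait(TimeUnit.NANOSECONDS.toMillis(remainingNs) + 1);
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        RefCountedDir dir = new RefCountedDir("data/index");
        // Assumed scenario: a fetch starts during shutdown, takes a reference,
        // and never gets the chance to release it.
        dir.incRef();
        boolean closed = dir.closeAndWait(100);
        System.out.println("closed cleanly: " + closed + ", refCount=" + dir.getRefCount());
    }
}
```

In this sketch the close gives up with the reference count still at 1, which is the same shape as the {{refCount=1}} timeout in the log above. If the reference is released before the deadline, the close succeeds.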
was (Author: caomanhdat):
Therefore I think that SOLR-13336 may be able to solve this failure.
> Adding Http2 equivalent classes of CloudSolrClient and
> HttpClusterStateProvider
> --------------------------------------------------------------------------------
>
> Key: SOLR-13276
> URL: https://issues.apache.org/jira/browse/SOLR-13276
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Cao Manh Dat
> Assignee: Cao Manh Dat
> Priority: Major
> Fix For: 8.1
>
> Attachments: SOLR-13276.patch, SOLR-13276.patch, SOLR-13276.patch,
> thetaphi-Lucene-Solr-master-Windows-7810.txt,
> thetaphi_Lucene-Solr-8.x-Windows_69.log.txt,
> thetaphi_Lucene-Solr-master-Windows_7754.log.txt
>
>
> Before we can move on and wipe out the usage of Apache HttpClient inside
> Solr-core, we need to create HTTP/2 equivalent classes of CloudSolrClient and
> HttpClusterStateProvider.