Shalin Shekhar Mangar created SOLR-11381:
--------------------------------------------
Summary: HdfsDirectoryFactory throws NPE on cleanup because file system has been closed
Key: SOLR-11381
URL: https://issues.apache.org/jira/browse/SOLR-11381
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: hdfs
Reporter: Shalin Shekhar Mangar
Priority: Trivial
Fix For: master (8.0), 7.1
I saw this happening in tests related to autoscaling. The cleanup of old index
directories is triggered on core close in a separate thread. This can cause a
race condition where the filesystem is closed before the cleanup starts
running. The directory listing then fails with "Filesystem closed", an NPE is
thrown, and the cleanup fails.
Fixing the NPE is simple, but I think this is a real bug where old directories
can be left around on HDFS. I don't know enough about HDFS to investigate
further. Leaving it here for interested people to pitch in.
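To make the failure mode concrete, here is a minimal, self-contained sketch of the race and of the simple null guard. It does not use Hadoop or Solr classes: `FakeFs`, `cleanup`, and the directory names are hypothetical stand-ins, and the assumption that the listing result is left null after the IOException is only one plausible reading of the two log entries in the trace. Note that the guard only avoids the NPE; the old directories would still be left behind.

```java
import java.io.IOException;

class OldIndexCleanupSketch {

    /** Hypothetical stand-in for a file system client; not a Hadoop class. */
    static class FakeFs {
        private volatile boolean closed;
        private final String[] dirs = {"index.old1", "index.old2"};

        void close() {
            closed = true;
        }

        /** Lists old index directories, failing once the file system is closed. */
        String[] list() throws IOException {
            if (closed) {
                throw new IOException("Filesystem closed");
            }
            return dirs.clone();
        }
    }

    /** Returns the number of old directories found, or -1 if the listing failed. */
    static int cleanup(FakeFs fs) {
        String[] oldDirs = null;
        try {
            oldDirs = fs.list();
        } catch (IOException e) {
            // Assumed to mirror the first ERROR entry: the failure is logged and swallowed.
            System.err.println("Error checking for old index directories: " + e.getMessage());
        }
        // Guard: without this check, oldDirs.length below would throw a NullPointerException.
        if (oldDirs == null) {
            return -1;
        }
        return oldDirs.length;
    }

    public static void main(String[] args) throws InterruptedException {
        FakeFs fs = new FakeFs();
        fs.close(); // the close wins the race against the cleanup thread
        Thread cleaner = new Thread(() -> System.out.println("result: " + cleanup(fs)));
        cleaner.start();
        cleaner.join();
    }
}
```

In this sketch the file system is closed before the cleanup thread starts, so the listing fails and `cleanup` returns -1 instead of throwing. A real fix for the leftover directories would presumably need the cleanup to finish, or be scheduled, before the file system is closed.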
{code}
105029 ERROR (OldIndexDirectoryCleanupThreadForCore-control_collection_shard1_replica_n1) [n:127.0.0.1:58542_ c:control_collection s:shard1 r:core_node2 x:control_collection_shard1_replica_n1] o.a.s.c.HdfsDirectoryFactory Error checking for old index directories to clean-up.
java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2083)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2069)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:791)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:106)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:853)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:849)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:860)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1557)
    at org.apache.solr.core.HdfsDirectoryFactory.cleanupOldIndexDirectories(HdfsDirectoryFactory.java:540)
    at org.apache.solr.core.SolrCore.lambda$cleanupOldIndexDirectories$32(SolrCore.java:3019)
    at java.lang.Thread.run(Thread.java:745)
105030 ERROR (OldIndexDirectoryCleanupThreadForCore-control_collection_shard1_replica_n1) [n:127.0.0.1:58542_ c:control_collection s:shard1 r:core_node2 x:control_collection_shard1_replica_n1] o.a.s.c.SolrCore Failed to cleanup old index directories for core control_collection_shard1_replica_n1
java.lang.NullPointerException
    at org.apache.solr.core.HdfsDirectoryFactory.cleanupOldIndexDirectories(HdfsDirectoryFactory.java:558)
    at org.apache.solr.core.SolrCore.lambda$cleanupOldIndexDirectories$32(SolrCore.java:3019)
    at java.lang.Thread.run(Thread.java:745)
{code}