[ 
https://issues.apache.org/jira/browse/SOLR-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795871#comment-16795871
 ] 

ASF subversion and git services commented on SOLR-11381:
--------------------------------------------------------

Commit 6064b03ac6f61b077dcfc6262568e466f2bf6467 in lucene-solr's branch 
refs/heads/branch_8x from Kevin Risden
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6064b03ac ]

SOLR-13330: Improve HDFS tests

Related JIRAs:
* SOLR-11010
* SOLR-11381
* SOLR-12040
* SOLR-13297

Changes:
* Consolidate hdfs configuration into HdfsTestUtil
* Ensure socketTimeout long enough for HDFS tests
* Ensure HdfsTestUtil.getClientConfiguration used in tests
* Replace deprecated HDFS calls
* Use try-with-resources to ensure closing of HDFS resources
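
The last item can be illustrated with a minimal sketch. This is not the code from the commit: `TrackedResource` and `TryWithResourcesSketch` are hypothetical stand-ins for HDFS handles, used only to show that try-with-resources closes every resource, in reverse order of opening, even when the body throws.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an HDFS handle (file system, stream, ...).
final class TrackedResource implements Closeable {
    // Records open/use/close events so the ordering can be inspected.
    static final List<String> EVENTS = new ArrayList<>();
    private final String name;

    TrackedResource(String name) {
        this.name = name;
        EVENTS.add("open " + name);
    }

    void use(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("simulated failure in " + name);
        }
        EVENTS.add("use " + name);
    }

    @Override
    public void close() {
        EVENTS.add("close " + name);
    }
}

final class TryWithResourcesSketch {
    // Returns true if the body completed, false if it threw.
    // Both resources are closed on either path.
    static boolean run(boolean fail) {
        try (TrackedResource fs = new TrackedResource("fs");
             TrackedResource out = new TrackedResource("out")) {
            out.use(fail);
            return true;
        } catch (IOException e) {
            // By the time this handler runs, "out" and "fs" are already closed.
            return false;
        }
    }

    public static void main(String[] args) {
        run(true);
        System.out.println(TrackedResource.EVENTS);
    }
}
```

The design point is that no explicit finally block is needed, so a failing test cannot leak a handle into the next test.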

Signed-off-by: Kevin Risden <kris...@apache.org>


> HdfsDirectoryFactory throws NPE on cleanup because file system has been closed
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-11381
>                 URL: https://issues.apache.org/jira/browse/SOLR-11381
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Hadoop Integration, hdfs
>            Reporter: Shalin Shekhar Mangar
>            Priority: Trivial
>             Fix For: 8.1, master (9.0)
>
>
> I saw this happening on tests related to autoscaling. The old directory
> cleanup is triggered on core close in a separate thread. This can cause a race
> condition where the filesystem is closed before the cleanup starts running.
> An NPE is then thrown and the cleanup fails.
> Fixing the NPE is simple, but I think this is a real bug where old directories
> can be left around on HDFS. I don't know enough about HDFS to investigate
> further. Leaving it here for interested people to pitch in.
> {code}
> 105029 ERROR (OldIndexDirectoryCleanupThreadForCore-control_collection_shard1_replica_n1) [n:127.0.0.1:58542_ c:control_collection s:shard1 r:core_node2 x:control_collection_shard1_replica_n1] o.a.s.c.HdfsDirectoryFactory Error checking for old index directories to clean-up.
> java.io.IOException: Filesystem closed
>       at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
>       at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2083)
>       at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2069)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:791)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:106)
>       at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:853)
>       at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:849)
>       at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:860)
>       at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
>       at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1557)
>       at org.apache.solr.core.HdfsDirectoryFactory.cleanupOldIndexDirectories(HdfsDirectoryFactory.java:540)
>       at org.apache.solr.core.SolrCore.lambda$cleanupOldIndexDirectories$32(SolrCore.java:3019)
>       at java.lang.Thread.run(Thread.java:745)
> 105030 ERROR (OldIndexDirectoryCleanupThreadForCore-control_collection_shard1_replica_n1) [n:127.0.0.1:58542_ c:control_collection s:shard1 r:core_node2 x:control_collection_shard1_replica_n1] o.a.s.c.SolrCore Failed to cleanup old index directories for core control_collection_shard1_replica_n1
> java.lang.NullPointerException
>       at org.apache.solr.core.HdfsDirectoryFactory.cleanupOldIndexDirectories(HdfsDirectoryFactory.java:558)
>       at org.apache.solr.core.SolrCore.lambda$cleanupOldIndexDirectories$32(SolrCore.java:3019)
>       at java.lang.Thread.run(Thread.java:745)
> {code}
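
The two log entries above are consistent with a listing call that fails, is logged, and leaves its result null, followed by code that dereferences that result. The sketch below reproduces that pattern in isolation. It is an assumption about the shape of the failure, not the actual HdfsDirectoryFactory source: `CleanupSketch`, `Lister`, and both method names are hypothetical.

```java
import java.io.IOException;

final class CleanupSketch {
    // Hypothetical stand-in for a file system client that may already be closed.
    interface Lister {
        String[] list(String path) throws IOException;
    }

    // Unguarded pattern: a failed listing leaves `entries` null,
    // and the loop below then throws NullPointerException.
    static int cleanupUnguarded(Lister fs, String dataDir) {
        String[] entries = null;
        try {
            entries = fs.list(dataDir);
        } catch (IOException e) {
            // Logged and swallowed, as in the first ERROR entry.
        }
        int visited = 0;
        for (String entry : entries) { // NPE when the listing failed
            visited++;
        }
        return visited;
    }

    // Guarded pattern: return early when nothing could be listed.
    static int cleanupGuarded(Lister fs, String dataDir) {
        String[] entries = null;
        try {
            entries = fs.list(dataDir);
        } catch (IOException e) {
            // Logged and swallowed.
        }
        if (entries == null) {
            return 0; // no NPE, but nothing was cleaned up either
        }
        int visited = 0;
        for (String entry : entries) {
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        Lister closed = path -> {
            throw new IOException("Filesystem closed");
        };
        System.out.println(cleanupGuarded(closed, "/data"));
    }
}
```

The null check only removes the second exception. When the file system is already closed the guarded version still visits no directories, which matches the reporter's concern that old index directories can be left behind on HDFS.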



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

