[ https://issues.apache.org/jira/browse/HADOOP-15547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16528280#comment-16528280 ]
Steve Loughran commented on HADOOP-15547:
-----------------------------------------
I'm testing this; the patch will include some fixes to make the default test
thread count & file size smaller.
One thing I want to highlight is that this connector isn't good at handling bad
configs; in particular, UnknownHostException is considered retriable. I have
had to turn off all retries & backoff intervals just to begin debugging why my
test is hanging. If I can't get the tests to fail properly on unrecoverable
exceptions, it's not going to be a good experience in the field.
{code}
[ERROR] test_0200_ListStatusPerformance(org.apache.hadoop.fs.azure.ITestListPerformance) Time elapsed: 0.332 s <<< ERROR!
org.apache.hadoop.fs.azure.AzureException: com.microsoft.azure.storage.StorageException:
    at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.retrieveMetadata(AzureNativeFileSystemStore.java:2152)
    at org.apache.hadoop.fs.azure.NativeAzureFileSystem.listStatus(NativeAzureFileSystem.java:2756)
    at org.apache.hadoop.fs.azure.ITestListPerformance.test_0200_ListStatusPerformance(ITestListPerformance.java:131)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
Caused by: com.microsoft.azure.storage.StorageException:
    at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:87)
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:209)
    at com.microsoft.azure.storage.blob.CloudBlobContainer.downloadAttributes(CloudBlobContainer.java:570)
    at org.apache.hadoop.fs.azure.StorageInterfaceImpl$CloudBlobContainerWrapperImpl.downloadAttributes(StorageInterfaceImpl.java:255)
    at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.checkContainer(AzureNativeFileSystemStore.java:1279)
    at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.retrieveMetadata(AzureNativeFileSystemStore.java:2068)
    ... 14 more
Caused by: java.net.UnknownHostException: somehosthere
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
    at sun.net.www.http.HttpClient.New(HttpClient.java:308)
    at sun.net.www.http.HttpClient.New(HttpClient.java:326)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1202)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:966)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1546)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
    at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:115)
    ... 18 more
{code}
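To make the fail-fast point concrete, here is a minimal sketch in plain JDK Java. It is not the connector's or the Azure SDK's retry code: FailFastRetry, isUnrecoverable and call are hypothetical names, and treating only UnknownHostException as unrecoverable is an assumption made for the example. The idea is to walk the cause chain (the trace above nests the UnknownHostException two levels down) and rethrow immediately instead of retrying.

```java
import java.net.UnknownHostException;
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only; not part of hadoop-azure or the Azure SDK.
final class FailFastRetry {
    private FailFastRetry() {}

    /** True if an UnknownHostException appears anywhere in the cause chain. */
    static boolean isUnrecoverable(Throwable t) {
        int depth = 0;
        // The depth bound protects against pathological cyclic cause chains.
        for (Throwable c = t; c != null && depth < 32; c = c.getCause()) {
            if (c instanceof UnknownHostException) {
                return true;
            }
            depth++;
        }
        return false;
    }

    /** Runs op up to maxAttempts times, but rethrows unrecoverable failures at once. */
    static <T> T call(Callable<T> op, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                if (isUnrecoverable(e)) {
                    throw e; // bad config such as an unknown host: retrying cannot help
                }
                last = e; // assumed transient: try again (backoff omitted in this sketch)
            }
        }
        throw last;
    }
}
```

With a check like this in front of the retry loop, a test pointed at a nonexistent host would fail on the first attempt rather than hang through every retry and backoff interval.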
> WASB: listStatus performance
> ----------------------------
>
> Key: HADOOP-15547
> URL: https://issues.apache.org/jira/browse/HADOOP-15547
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/azure
> Affects Versions: 2.9.1, 3.0.2
> Reporter: Thomas Marquardt
> Assignee: Thomas Marquardt
> Priority: Major
> Attachments: HADOOP-15547.001.patch, HADOOP-15547.002.patch,
> HADOOP-15547.003.patch
>
>
> The WASB implementation of Filesystem.listStatus is very slow due to an O(n!)
> algorithm to remove duplicates, and uses too much memory due to the extra
> conversion from BlobListItem to FileMetadata to FileStatus. It takes over 30
> minutes to list 700,000 files.