Re: [PR] [#5361] improvment(hadoop-catalog): Introduce a timeout mechanism to get Hadoop File System. [gravitino]
yuqi1129 commented on PR #5406: URL: https://github.com/apache/gravitino/pull/5406#issuecomment-2519470208 @jerryshao Do you have any suggestions on this issue and should I proceed with it? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] [#5361] improvment(hadoop-catalog): Introduce a timeout mechanism to get Hadoop File System. [gravitino]
yuqi1129 commented on code in PR #5406:
URL: https://github.com/apache/gravitino/pull/5406#discussion_r1824368270
##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -774,6 +778,27 @@ FileSystem getFileSystem(Path path, Map<String, String> config) throws IOExcepti
               scheme, path, fileSystemProvidersMap.keySet(), fileSystemProvidersMap.values()));
     }
-    return provider.getFileSystem(path, config);
+    int timeoutSeconds =
+        (int)
+            propertiesMetadata
+                .catalogPropertiesMetadata()
+                .getOrDefault(config, HadoopCatalogPropertiesMetadata.GET_FILESYSTEM_TIMEOUT_SECONDS);
+    try {
+      AtomicReference<FileSystem> fileSystem = new AtomicReference<>();
+      Awaitility.await()
+          .atMost(timeoutSeconds, TimeUnit.SECONDS)
+          .until(
+              () -> {
+                fileSystem.set(provider.getFileSystem(path, config));
Review Comment:
If the user sets an incorrect endpoint, the client keeps retrying the
connection for a certain amount of time before giving up.
Re: [PR] [#5361] improvment(hadoop-catalog): Introduce a timeout mechanism to get Hadoop File System. [gravitino]
jerryshao commented on code in PR #5406:
URL: https://github.com/apache/gravitino/pull/5406#discussion_r1824341558
##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -774,6 +778,27 @@ FileSystem getFileSystem(Path path, Map<String, String> config) throws IOExcepti
               scheme, path, fileSystemProvidersMap.keySet(), fileSystemProvidersMap.values()));
     }
-    return provider.getFileSystem(path, config);
+    int timeoutSeconds =
+        (int)
+            propertiesMetadata
+                .catalogPropertiesMetadata()
+                .getOrDefault(config, HadoopCatalogPropertiesMetadata.GET_FILESYSTEM_TIMEOUT_SECONDS);
+    try {
+      AtomicReference<FileSystem> fileSystem = new AtomicReference<>();
+      Awaitility.await()
+          .atMost(timeoutSeconds, TimeUnit.SECONDS)
+          .until(
+              () -> {
+                fileSystem.set(provider.getFileSystem(path, config));
Review Comment:
Why is it so time-consuming to initialize the filesystem client?
Re: [PR] [#5361] improvment(hadoop-catalog): Introduce a timeout mechanism to get Hadoop File System. [gravitino]
jerryshao commented on code in PR #5406:
URL: https://github.com/apache/gravitino/pull/5406#discussion_r1824422638
##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -774,6 +778,27 @@ FileSystem getFileSystem(Path path, Map<String, String> config) throws IOExcepti
               scheme, path, fileSystemProvidersMap.keySet(), fileSystemProvidersMap.values()));
     }
-    return provider.getFileSystem(path, config);
+    int timeoutSeconds =
+        (int)
+            propertiesMetadata
+                .catalogPropertiesMetadata()
+                .getOrDefault(config, HadoopCatalogPropertiesMetadata.GET_FILESYSTEM_TIMEOUT_SECONDS);
+    try {
+      AtomicReference<FileSystem> fileSystem = new AtomicReference<>();
+      Awaitility.await()
+          .atMost(timeoutSeconds, TimeUnit.SECONDS)
+          .until(
+              () -> {
+                fileSystem.set(provider.getFileSystem(path, config));
Review Comment:
I don't think you can really fix this problem without using another thread
to create the FS and polling its status asynchronously.
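
One possible reading of this suggestion, sketched with only `java.util.concurrent`: run the blocking call on a separate worker thread and wait on its `Future` with a deadline. The class `TimedInit` and the helper `callWithTimeout` are hypothetical names for illustration and are not part of the PR; in the PR's context the task would presumably wrap `provider.getFileSystem(path, config)`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not taken from the PR.
class TimedInit {

  /**
   * Runs a blocking task on a worker thread and waits at most the given time for its result.
   * Throws TimeoutException if the task has not finished by the deadline.
   */
  static <T> T callWithTimeout(Callable<T> task, long timeout, TimeUnit unit) throws Exception {
    // Daemon thread, so a call that never returns cannot keep the JVM alive.
    ExecutorService executor =
        Executors.newSingleThreadExecutor(
            r -> {
              Thread t = new Thread(r, "fs-init");
              t.setDaemon(true);
              return t;
            });
    Future<T> future = executor.submit(task);
    try {
      // Only the waiting is bounded here; the task itself keeps running until it
      // returns or reacts to the interrupt sent below.
      return future.get(timeout, unit);
    } catch (TimeoutException e) {
      future.cancel(true); // best effort: interrupts the worker thread
      throw e;
    } catch (ExecutionException e) {
      // Surface the task's own failure (for example an IOException) to the caller.
      Throwable cause = e.getCause();
      if (cause instanceof Exception) {
        throw (Exception) cause;
      }
      throw e;
    } finally {
      executor.shutdownNow();
    }
  }
}
```

A caller could then write something like `callWithTimeout(() -> provider.getFileSystem(path, config), timeoutSeconds, TimeUnit.SECONDS)`. Note that cancellation is only an interrupt: a client that ignores interrupts while retrying will continue in the background even after the caller has received the timeout.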
Re: [PR] [#5361] improvment(hadoop-catalog): Introduce a timeout mechanism to get Hadoop File System. [gravitino]
yuqi1129 commented on PR #5406: URL: https://github.com/apache/gravitino/pull/5406#issuecomment-2449681635 @jerryshao Please help check whether this should be included in release 0.7.0.
