Mark Mc Keown created HDFS-16825:
------------------------------------
Summary: hadoop-azure flush timing out and triggering retry
Key: HDFS-16825
URL: https://issues.apache.org/jira/browse/HDFS-16825
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Mark Mc Keown
From AbfsHttpOperation, the code that creates an HTTP connection to Azure is:
{code}
public AbfsHttpOperation(final URL url, final String method,
    final List<AbfsHttpHeader> requestHeaders)
    throws IOException {
  this.isTraceEnabled = LOG.isTraceEnabled();
  this.url = url;
  this.method = method;
  this.clientRequestId = UUID.randomUUID().toString();

  this.connection = openConnection();
  if (this.connection instanceof HttpsURLConnection) {
    HttpsURLConnection secureConn = (HttpsURLConnection) this.connection;
    SSLSocketFactory sslSocketFactory = SSLSocketFactoryEx.getDefaultFactory();
    if (sslSocketFactory != null) {
      secureConn.setSSLSocketFactory(sslSocketFactory);
    }
  }

  // CONNECT_TIMEOUT and READ_TIMEOUT are compile-time constants;
  // READ_TIMEOUT is fixed at 30 seconds.
  this.connection.setConnectTimeout(CONNECT_TIMEOUT);
  this.connection.setReadTimeout(READ_TIMEOUT);

  this.connection.setRequestMethod(method);

  for (AbfsHttpHeader header : requestHeaders) {
    this.connection.setRequestProperty(header.getName(), header.getValue());
  }

  this.connection.setRequestProperty(
      HttpHeaderConfigurations.X_MS_CLIENT_REQUEST_ID, clientRequestId);
}
{code}
READ_TIMEOUT is hard-coded to 30 seconds. When a file uploaded to Azure is closed, a flush operation is triggered; Azure sometimes takes longer than 30 seconds to respond to that flush, and the timeout then triggers a retry within the hadoop-azure library.

(This can cause issues with Databricks Autoloader, which monitors Event Grid for triggers to ingest data: multiple flush/close operations can confuse it. Strictly speaking that is an Autoloader bug, since retries are normal behaviour, but the short timeout makes it far more likely.)

Can READ_TIMEOUT be increased or made configurable?
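
If the timeout were driven by configuration rather than a constant, long flushes could be accommodated without code changes. Below is a minimal sketch of one possible approach, using Hadoop's standard Configuration API; the key name fs.azure.http.read.timeout, its default, and the openConnection signature here are illustrative assumptions, not existing hadoop-azure API:

{code}
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

import org.apache.hadoop.conf.Configuration;

public class ConfigurableReadTimeoutSketch {

  // The current hard-coded value in AbfsHttpOperation.
  private static final int DEFAULT_READ_TIMEOUT_MS = 30_000;

  // Hypothetical key; not an existing hadoop-azure configuration property.
  private static final String READ_TIMEOUT_KEY = "fs.azure.http.read.timeout";

  static HttpURLConnection openConnection(URL url, Configuration conf)
      throws IOException {
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    // Fall back to the existing 30 second behaviour when the key is unset.
    connection.setReadTimeout(
        conf.getInt(READ_TIMEOUT_KEY, DEFAULT_READ_TIMEOUT_MS));
    return connection;
  }
}
{code}

Keeping 30 seconds as the default would preserve current behaviour, while letting deployments that see slow flush responses raise the limit instead of relying on the retry path.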