I see that we can still use the other implementation, but we were hoping to 
benefit from the bug fix made in Flink 1.4.0 around the 'repeated' loading of 
configuration. I'll try the old implementation and see whether it still 
works.

We have also seen discussions of Stocator, a more native connector that 
interfaces directly with IBM Object Storage and can be configured through 
hdfs-site.xml. It might speed things up. 

-----Original Message-----
From: Aljoscha Krettek [mailto:aljos...@apache.org] 
Sent: Thursday, January 25, 2018 6:30 PM
To: Marchant, Hayden [ICG-IT] <hm97...@imceu.eu.ssmb.com>
Cc: user@flink.apache.org
Subject: Re: S3 for state backend in Flink 1.4.0

Hi,

Did you try overriding that config and it didn't work? That dependency is in 
fact still using the Hadoop S3 FS implementation but is shading everything to 
our own namespace so that there can't be any version conflicts. If that doesn't 
work then we need to look into this further.

The way you usually use this is by copying the flink-s3-fs-hadoop jar from the 
opt/ folder to the lib/ folder. I'm not sure including it as a dependency will 
work, but it might. You also don't have to use the flink-s3-fs-hadoop 
dependency if the regular Hadoop S3 support worked for you before. It's only 
an additional option.
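
For reference, a minimal sketch of what the endpoint override could look like 
in flink-conf.yaml once the jar is in lib/. This assumes the shaded filesystem 
forwards keys with the s3. prefix to the corresponding Hadoop S3A settings 
(fs.s3a.endpoint, fs.s3a.path.style.access), which is worth verifying on 
1.4.0. The hostname is a hypothetical placeholder, not a real endpoint.

```yaml
# flink-conf.yaml (sketch, not verified against 1.4.0)
# Assumption: "s3."-prefixed keys are passed through to Hadoop S3A ("fs.s3a.*").
s3.endpoint: objectstorage.example.internal   # hypothetical internal endpoint
# Assumption: the store expects path-style requests rather than
# bucket-name subdomains of the endpoint.
s3.path.style.access: true
```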

Best,
Aljoscha

> On 24. Jan 2018, at 16:33, Marchant, Hayden <hayden.march...@citi.com> wrote:
> 
> Hi,
> 
> We have a Flink Streaming application that uses S3 for storing checkpoints. 
> We are not using 'regular' S3, but rather IBM Object Storage which has an 
> S3-compatible connector. We had quite some challenges in overriding the 
> endpoint from the default s3.amazonaws.com to our internal IBM Object 
> Storage endpoint. In 1.3.2, we managed to get this working by providing our 
> own jets3t.properties file that overrode s3service.s3-endpoint 
> (https://jets3t.s3.amazonaws.com/toolkit/configuration.html).
> 
> When upgrading to 1.4.0, we added a dependency on the flink-s3-fs-hadoop 
> artifact. It seems that our override via jets3t.properties is no longer 
> relevant, since it does not use the Hadoop implementation anymore. 
> 
> Is there a way to override this default endpoint, or can we do this with the 
> Presto implementation? Please note that if we provide the endpoint in the 
> URL for the state backend, it simply appends s3.amazonaws.com to the URL, 
> for example s3://myobjectstorageendpoint.s3.amazonaws.com.
> 
> Are there any other solutions such as to 'rollback' to the Hadoop 
> implementation of S3?
> 
> Thanks,
> Hayden
