Re: [DISCUSS] FLINK-17989 - java.lang.NoClassDefFoundError org.apache.flink.fs.azure.common.hadoop.HadoopRecoverableWriter

2020-05-28 Thread Israel Ekpo
Thanks Till.

I will take a look at that tomorrow and let you know if I hit any
roadblocks.

On Thu, May 28, 2020 at 12:11 PM Till Rohrmann  wrote:

> I think what needs to be done is to implement an
> org.apache.flink.core.fs.RecoverableWriter for the respective file
> system, similar to HadoopRecoverableWriter and S3RecoverableWriter.
>
> Cheers,
> Till
>
> On Thu, May 28, 2020 at 6:00 PM Israel Ekpo  wrote:
>
>> Hi Till,
>>
>> Thanks for your feedback and guidance.
>>
>> It seems similar work was done for the S3 file system, where relocations
>> were removed for those file system plugins.
>>
>> https://issues.apache.org/jira/browse/FLINK-11956
>>
>> It appears the same needs to be done for the Azure file systems.
>>
>> I will try to connect with Klou today to see what level of effort is
>> needed to add this support.
>>
>> Thanks.
>>
>>
>>
>> On Thu, May 28, 2020 at 11:54 AM Till Rohrmann 
>> wrote:
>>
>>> Hi Israel,
>>>
>>> thanks for reaching out to the Flink community. As Guowei said, the
>>> StreamingFileSink can currently only recover from faults if it writes to
>>> HDFS or S3. Other file systems are currently not supported if you need
>>> fault tolerance.
>>>
>>> Maybe Klou can tell you more about the background and what is needed to
>>> make it work with other file systems. He is one of the original authors of
>>> the StreamingFileSink.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Thu, May 28, 2020 at 4:39 PM Israel Ekpo 
>>> wrote:
>>>
>>>> Guowei,
>>>>
>>>> What do we need to do to add support for it?
>>>>
>>>> How do I get started on that?
>>>>
>>>> On Wed, May 27, 2020 at 8:53 PM Guowei Ma wrote:
>>>>
>>>>> Hi,
>>>>> I think the StreamingFileSink does not currently support Azure.
>>>>> You can find more details here [1].
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/FLINK-17444
>>>>> Best,
>>>>> Guowei
>>>>>
>>>>> On Thu, May 28, 2020 at 6:04 AM Israel Ekpo wrote:
>>>>>
>>>>>> You can assign the task to me; I would like to collaborate with
>>>>>> someone to fix it.
>>>>>>
>>>>>> On Wed, May 27, 2020 at 5:52 PM Israel Ekpo wrote:
>>>>>>
>>>>>>> Some users are running into issues when using Azure Blob Storage
>>>>>>> with the StreamingFileSink.
>>>>>>>
>>>>>>> https://issues.apache.org/jira/browse/FLINK-17989
>>>>>>>
>>>>>>> The issue occurs because certain packages are relocated in the POM
>>>>>>> file and some classes are dropped from the final shaded jar.
>>>>>>>
>>>>>>> I have attempted to comment out the relocations and recompile the
>>>>>>> source, but I keep hitting other relocations and filters each time
>>>>>>> I update a specific POM file.
>>>>>>>
>>>>>>> How can this be addressed so that these users can be unblocked? Why
>>>>>>> are the classes filtered out? What is the workaround? I can work on
>>>>>>> the patch if I have some guidance.
>>>>>>>
>>>>>>> This is an issue in Flink 1.9 and 1.10, and I believe 1.11 has the
>>>>>>> same issue, but I have yet to confirm.
>>>>>>>
>>>>>>> Thanks.


Re: [DISCUSS] FLINK-17989 - java.lang.NoClassDefFoundError org.apache.flink.fs.azure.common.hadoop.HadoopRecoverableWriter

2020-05-28 Thread Till Rohrmann
I think what needs to be done is to implement an
org.apache.flink.core.fs.RecoverableWriter for the respective file
system, similar to HadoopRecoverableWriter and S3RecoverableWriter.

Cheers,
Till
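
The suggestion above can be illustrated with a minimal sketch of the
write / persist / recover / commit cycle that a recoverable writer has to
provide. This is not Flink's RecoverableWriter API: the class
SketchRecoverableWriter, its method names, and the ".inprogress" suffix are
hypothetical stand-ins, and a local file with truncate and rename is assumed
in place of Azure Blob Storage.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical, simplified stand-in for a recoverable writer; not Flink's API. */
final class SketchRecoverableWriter {
    private final Path inProgress; // data is staged here until commit
    private final Path target;     // becomes visible only on commit

    SketchRecoverableWriter(Path target) {
        this.target = target;
        this.inProgress = target.resolveSibling(target.getFileName() + ".inprogress");
    }

    /** Appends bytes to the in-progress file. */
    void write(String data) {
        try {
            Files.write(inProgress, data.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Checkpoint step: flush to disk and return the offset that can be resumed from. */
    long persist() {
        try (FileChannel ch = FileChannel.open(inProgress,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ch.force(true);
            return ch.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Recovery step: discard everything written after the persisted offset. */
    void recover(long offset) {
        try (FileChannel ch = FileChannel.open(inProgress, StandardOpenOption.WRITE)) {
            ch.truncate(offset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Commit step: publish the staged file under its final name. */
    void commit() {
        try {
            Files.move(inProgress, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs one cycle with a simulated failure after the checkpoint. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("sketch");
            Path part = dir.resolve("part-0");
            SketchRecoverableWriter w = new SketchRecoverableWriter(part);
            w.write("a");
            w.write("b");
            long checkpoint = w.persist(); // resumable offset after "ab"
            w.write("lost");               // written after the checkpoint
            w.recover(checkpoint);         // simulated restart rolls back to the checkpoint
            w.commit();
            return new String(Files.readAllBytes(part), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling SketchRecoverableWriter.demo() commits only the data written before
the checkpoint. An object store generally offers neither truncate nor an
atomic rename, so a real implementation for Azure would need a different
mechanism for the recover and commit steps; which one is appropriate is left
open here.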

On Thu, May 28, 2020 at 6:00 PM Israel Ekpo  wrote:

> Hi Till,
>
> Thanks for your feedback and guidance.
>
> It seems similar work was done for the S3 file system, where relocations
> were removed for those file system plugins.
>
> https://issues.apache.org/jira/browse/FLINK-11956
>
> It appears the same needs to be done for the Azure file systems.
>
> I will try to connect with Klou today to see what level of effort is
> needed to add this support.
>
> Thanks.
>
>
>
> On Thu, May 28, 2020 at 11:54 AM Till Rohrmann 
> wrote:
>
>> Hi Israel,
>>
>> thanks for reaching out to the Flink community. As Guowei said, the
>> StreamingFileSink can currently only recover from faults if it writes to
>> HDFS or S3. Other file systems are currently not supported if you need
>> fault tolerance.
>>
>> Maybe Klou can tell you more about the background and what is needed to
>> make it work with other file systems. He is one of the original authors of
>> the StreamingFileSink.
>>
>> Cheers,
>> Till
>>
>> On Thu, May 28, 2020 at 4:39 PM Israel Ekpo  wrote:
>>
>>> Guowei,
>>>
>>> What do we need to do to add support for it?
>>>
>>> How do I get started on that?
>>>
>>>
>>>
>>> On Wed, May 27, 2020 at 8:53 PM Guowei Ma  wrote:
>>>
>>>> Hi,
>>>> I think the StreamingFileSink does not currently support Azure.
>>>> You can find more details here [1].
>>>>
>>>> [1] https://issues.apache.org/jira/browse/FLINK-17444
>>>> Best,
>>>> Guowei
>>>>
>>>> On Thu, May 28, 2020 at 6:04 AM Israel Ekpo wrote:
>>>>
>>>>> You can assign the task to me; I would like to collaborate with
>>>>> someone to fix it.
>>>>>
>>>>> On Wed, May 27, 2020 at 5:52 PM Israel Ekpo wrote:
>>>>>
>>>>>> Some users are running into issues when using Azure Blob Storage
>>>>>> with the StreamingFileSink.
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/FLINK-17989
>>>>>>
>>>>>> The issue occurs because certain packages are relocated in the POM
>>>>>> file and some classes are dropped from the final shaded jar.
>>>>>>
>>>>>> I have attempted to comment out the relocations and recompile the
>>>>>> source, but I keep hitting other relocations and filters each time
>>>>>> I update a specific POM file.
>>>>>>
>>>>>> How can this be addressed so that these users can be unblocked? Why
>>>>>> are the classes filtered out? What is the workaround? I can work on
>>>>>> the patch if I have some guidance.
>>>>>>
>>>>>> This is an issue in Flink 1.9 and 1.10, and I believe 1.11 has the
>>>>>> same issue, but I have yet to confirm.
>>>>>>
>>>>>> Thanks.


Re: [DISCUSS] FLINK-17989 - java.lang.NoClassDefFoundError org.apache.flink.fs.azure.common.hadoop.HadoopRecoverableWriter

2020-05-28 Thread Israel Ekpo
Hi Till,

Thanks for your feedback and guidance.

It seems similar work was done for the S3 file system, where relocations
were removed for those file system plugins.

https://issues.apache.org/jira/browse/FLINK-11956

It appears the same needs to be done for the Azure file systems.

I will try to connect with Klou today to see what level of effort is
needed to add this support.

Thanks.



On Thu, May 28, 2020 at 11:54 AM Till Rohrmann  wrote:

> Hi Israel,
>
> thanks for reaching out to the Flink community. As Guowei said, the
> StreamingFileSink can currently only recover from faults if it writes to
> HDFS or S3. Other file systems are currently not supported if you need
> fault tolerance.
>
> Maybe Klou can tell you more about the background and what is needed to
> make it work with other file systems. He is one of the original authors of
> the StreamingFileSink.
>
> Cheers,
> Till
>
> On Thu, May 28, 2020 at 4:39 PM Israel Ekpo  wrote:
>
>> Guowei,
>>
>> What do we need to do to add support for it?
>>
>> How do I get started on that?
>>
>>
>>
>> On Wed, May 27, 2020 at 8:53 PM Guowei Ma  wrote:
>>
>>> Hi,
>>> I think the StreamingFileSink does not currently support Azure.
>>> You can find more details here [1].
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-17444
>>> Best,
>>> Guowei
>>>
>>> On Thu, May 28, 2020 at 6:04 AM Israel Ekpo wrote:
>>>
>>>> You can assign the task to me; I would like to collaborate with
>>>> someone to fix it.
>>>>
>>>> On Wed, May 27, 2020 at 5:52 PM Israel Ekpo wrote:
>>>>
>>>>> Some users are running into issues when using Azure Blob Storage
>>>>> with the StreamingFileSink.
>>>>>
>>>>> https://issues.apache.org/jira/browse/FLINK-17989
>>>>>
>>>>> The issue occurs because certain packages are relocated in the POM
>>>>> file and some classes are dropped from the final shaded jar.
>>>>>
>>>>> I have attempted to comment out the relocations and recompile the
>>>>> source, but I keep hitting other relocations and filters each time
>>>>> I update a specific POM file.
>>>>>
>>>>> How can this be addressed so that these users can be unblocked? Why
>>>>> are the classes filtered out? What is the workaround? I can work on
>>>>> the patch if I have some guidance.
>>>>>
>>>>> This is an issue in Flink 1.9 and 1.10, and I believe 1.11 has the
>>>>> same issue, but I have yet to confirm.
>>>>>
>>>>> Thanks.



Re: [DISCUSS] FLINK-17989 - java.lang.NoClassDefFoundError org.apache.flink.fs.azure.common.hadoop.HadoopRecoverableWriter

2020-05-28 Thread Till Rohrmann
Hi Israel,

thanks for reaching out to the Flink community. As Guowei said, the
StreamingFileSink can currently only recover from faults if it writes to
HDFS or S3. Other file systems are currently not supported if you need
fault tolerance.

Maybe Klou can tell you more about the background and what is needed to
make it work with other file systems. He is one of the original authors of
the StreamingFileSink.

Cheers,
Till

On Thu, May 28, 2020 at 4:39 PM Israel Ekpo  wrote:

> Guowei,
>
> What do we need to do to add support for it?
>
> How do I get started on that?
>
>
>
> On Wed, May 27, 2020 at 8:53 PM Guowei Ma  wrote:
>
>> Hi,
>> I think the StreamingFileSink does not currently support Azure.
>> You can find more details here [1].
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-17444
>> Best,
>> Guowei
>>
>> On Thu, May 28, 2020 at 6:04 AM Israel Ekpo wrote:
>>
>>> You can assign the task to me; I would like to collaborate with
>>> someone to fix it.
>>>
>>> On Wed, May 27, 2020 at 5:52 PM Israel Ekpo wrote:
>>>
>>>> Some users are running into issues when using Azure Blob Storage
>>>> with the StreamingFileSink.
>>>>
>>>> https://issues.apache.org/jira/browse/FLINK-17989
>>>>
>>>> The issue occurs because certain packages are relocated in the POM
>>>> file and some classes are dropped from the final shaded jar.
>>>>
>>>> I have attempted to comment out the relocations and recompile the
>>>> source, but I keep hitting other relocations and filters each time
>>>> I update a specific POM file.
>>>>
>>>> How can this be addressed so that these users can be unblocked? Why
>>>> are the classes filtered out? What is the workaround? I can work on
>>>> the patch if I have some guidance.
>>>>
>>>> This is an issue in Flink 1.9 and 1.10, and I believe 1.11 has the
>>>> same issue, but I have yet to confirm.
>>>>
>>>> Thanks.


Re: [DISCUSS] FLINK-17989 - java.lang.NoClassDefFoundError org.apache.flink.fs.azure.common.hadoop.HadoopRecoverableWriter

2020-05-28 Thread Israel Ekpo
Guowei,

What do we need to do to add support for it?

How do I get started on that?



On Wed, May 27, 2020 at 8:53 PM Guowei Ma  wrote:

> Hi,
> I think the StreamingFileSink does not currently support Azure.
> You can find more details here [1].
>
> [1] https://issues.apache.org/jira/browse/FLINK-17444
> Best,
> Guowei
>
> On Thu, May 28, 2020 at 6:04 AM Israel Ekpo wrote:
>
>> You can assign the task to me; I would like to collaborate with someone
>> to fix it.
>>
>> On Wed, May 27, 2020 at 5:52 PM Israel Ekpo wrote:
>>
>>> Some users are running into issues when using Azure Blob Storage with
>>> the StreamingFileSink.
>>>
>>> https://issues.apache.org/jira/browse/FLINK-17989
>>>
>>> The issue occurs because certain packages are relocated in the POM file
>>> and some classes are dropped from the final shaded jar.
>>>
>>> I have attempted to comment out the relocations and recompile the
>>> source, but I keep hitting other relocations and filters each time I
>>> update a specific POM file.
>>>
>>> How can this be addressed so that these users can be unblocked? Why are
>>> the classes filtered out? What is the workaround? I can work on the patch
>>> if I have some guidance.
>>>
>>> This is an issue in Flink 1.9 and 1.10, and I believe 1.11 has the same
>>> issue, but I have yet to confirm.
>>>
>>> Thanks.
>>>
>>>
>>>
>>


Re: [DISCUSS] FLINK-17989 - java.lang.NoClassDefFoundError org.apache.flink.fs.azure.common.hadoop.HadoopRecoverableWriter

2020-05-27 Thread Guowei Ma
Hi,
I think the StreamingFileSink does not currently support Azure.
You can find more details here [1].

[1] https://issues.apache.org/jira/browse/FLINK-17444
Best,
Guowei


On Thu, May 28, 2020 at 6:04 AM Israel Ekpo wrote:

> You can assign the task to me; I would like to collaborate with someone
> to fix it.
>
> On Wed, May 27, 2020 at 5:52 PM Israel Ekpo wrote:
>
>> Some users are running into issues when using Azure Blob Storage with
>> the StreamingFileSink.
>>
>> https://issues.apache.org/jira/browse/FLINK-17989
>>
>> The issue occurs because certain packages are relocated in the POM file
>> and some classes are dropped from the final shaded jar.
>>
>> I have attempted to comment out the relocations and recompile the
>> source, but I keep hitting other relocations and filters each time I
>> update a specific POM file.
>>
>> How can this be addressed so that these users can be unblocked? Why are
>> the classes filtered out? What is the workaround? I can work on the patch
>> if I have some guidance.
>>
>> This is an issue in Flink 1.9 and 1.10, and I believe 1.11 has the same
>> issue, but I have yet to confirm.
>>
>> Thanks.
>>
>>
>>
>


Re: [DISCUSS] FLINK-17989 - java.lang.NoClassDefFoundError org.apache.flink.fs.azure.common.hadoop.HadoopRecoverableWriter

2020-05-27 Thread Israel Ekpo
You can assign the task to me; I would like to collaborate with someone
to fix it.

On Wed, May 27, 2020 at 5:52 PM Israel Ekpo wrote:

> Some users are running into issues when using Azure Blob Storage with
> the StreamingFileSink.
>
> https://issues.apache.org/jira/browse/FLINK-17989
>
> The issue occurs because certain packages are relocated in the POM file
> and some classes are dropped from the final shaded jar.
>
> I have attempted to comment out the relocations and recompile the source,
> but I keep hitting other relocations and filters each time I update a
> specific POM file.
>
> How can this be addressed so that these users can be unblocked? Why are
> the classes filtered out? What is the workaround? I can work on the patch
> if I have some guidance.
>
> This is an issue in Flink 1.9 and 1.10, and I believe 1.11 has the same
> issue, but I have yet to confirm.
>
> Thanks.
>
>
>
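
The missing-class symptom in the original report (a NoClassDefFoundError
because a class was dropped from the shaded jar) can be checked without
starting a job. The following is a small sketch, assuming a hypothetical
helper class ClassPresenceCheck and assuming that the class loader of the
helper itself is the right one to probe; with plugin-based class loading a
different loader may be needed.

```java
/** Hypothetical diagnostic helper; not part of Flink. */
final class ClassPresenceCheck {
    /** Returns true if the named class can be loaded, without running its static initializers. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClassPresenceCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so it is covered here as well.
            return false;
        }
    }
}
```

With the Azure file system jar on the classpath, probing
"org.apache.flink.fs.azure.common.hadoop.HadoopRecoverableWriter" (the class
named in the subject line) would indicate whether it survived shading.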