[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2021-04-22 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17328136#comment-17328136
 ] 

Flink Jira Bot commented on FLINK-15378:


This major issue is unassigned and itself and all of its Sub-Tasks have not 
been updated for 30 days. So, it has been labeled "stale-major". If this ticket 
is indeed "major", please either assign yourself or give an update. Afterwards, 
please remove the label. In 7 days the issue will be deprioritized.

> StreamFileSystemSink supported mutil hdfs plugins.
> --
>
> Key: FLINK-15378
> URL: https://issues.apache.org/jira/browse/FLINK-15378
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, FileSystems
>Affects Versions: 1.9.2, 1.10.0
>Reporter: ouyangwulin
>Priority: Major
>  Labels: pull-request-available, stale-major
> Fix For: 1.13.0
>
> Attachments: jobmananger.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [As reported on the mailing 
> list|https://lists.apache.org/thread.html/7a6b1e341bde0ef632a82f8d46c9c93da358244b6bac0d8d544d11cb%40%3Cuser.flink.apache.org%3E]
> Request 1:  FileSystem plugins should not affect the default YARN dependencies.
> Request 2:  StreamingFileSink should support multiple HDFS plugins under the same 
> scheme.
> Problem description:
>     When I put a filesystem plugin into FLINK_HOME/plugins, with the 
> class '*com.filesystem.plugin.FileSystemFactoryEnhance*' 
> implementing '*FileSystemFactory*', then when the JM starts it calls 
> FileSystem.initialize(configuration, 
> PluginUtils.createPluginManagerFromRootFolder(configuration)) to load 
> factories into the map FileSystem#FS_FACTORIES, whose key is only the 
> scheme. The TM/JM use local Hadoop conf A, while the user code uses Hadoop 
> conf B in the filesystem plugin; conf A and conf B point to different Hadoop 
> clusters. The JM then fails to start, because the blob server in the JM 
> loads conf B to get the filesystem. The full log is attached.
>  
> Proposed fix:
>     Use the scheme plus a specific identifier as the key for 'FileSystem#FS_FACTORIES'.
>
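The proposed key change can be sketched roughly as follows. This is a simplified, hypothetical illustration, not Flink's actual internals: the class name `FsFactoryRegistry` and the factory strings are made up, and the real `FileSystem#FS_FACTORIES` map holds factory objects rather than strings.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposal: key the factory map by scheme + authority
// instead of scheme alone, so two HDFS plugins with different Hadoop
// configs can coexist under the same "hdfs" scheme.
public class FsFactoryRegistry {

    /** Composite key: scheme plus optional authority (host:port). */
    static String key(URI uri) {
        String authority = uri.getAuthority();
        return authority == null ? uri.getScheme() : uri.getScheme() + "://" + authority;
    }

    private final Map<String, String> factories = new HashMap<>();

    void register(URI uri, String factory) {
        factories.put(key(uri), factory);
    }

    String lookup(URI uri) {
        // Fall back to a scheme-only entry when no authority-specific
        // factory was registered, preserving the old behavior.
        String f = factories.get(key(uri));
        return f != null ? f : factories.get(uri.getScheme());
    }

    public static void main(String[] args) {
        FsFactoryRegistry r = new FsFactoryRegistry();
        r.register(URI.create("hdfs://namenode1:8020"), "factory-with-conf-A");
        r.register(URI.create("hdfs://namenode2:8020"), "factory-with-conf-B");
        System.out.println(r.lookup(URI.create("hdfs://namenode1:8020/path")));
        System.out.println(r.lookup(URI.create("hdfs://namenode2:8020/path")));
    }
}
```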



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2020-01-02 Thread ouyangwulin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006769#comment-17006769
 ] 

ouyangwulin commented on FLINK-15378:
-

[~pnowojski] thanks for your reply. 
{noformat}
For one thing, the current approach in the proposed PR is not generic enough. 
It limits the support for different configurations to just 
StreamingFileSink.{noformat}
I think it can also be used for the TM/JM: change the TM/JM to obtain 
filesystems via FileSystem.get(uri, identifier).

 
{noformat}
If we allow to identify plugins by parts from the URI (for example host or port 
as suggested by Yang Wang ), that would be better.{noformat}
If we identify plugins by parts of the URI, the TM/JM filesystem lookup 
methods must change as well; a separate identifier seems more flexible.
{noformat}
one plugin could be used by different file system instances.{noformat}
That would be a good idea. I will read the code again and try to work out how 
to implement it.

 



[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2020-01-02 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006692#comment-17006692
 ] 

Piotr Nowojski commented on FLINK-15378:


Thanks for the explanations, I think I now understand the issue.

For one thing, the current approach in the [proposed 
PR|https://github.com/apache/flink/pull/10686/] is not generic enough. It 
limits the support for different configurations to just {{StreamingFileSink}}. 
If we could identify plugins by parts of the URI (for example {{host}} or 
{{port}}, as suggested by [~fly_in_gis]), that would be better.

However, I see a couple of issues/follow-up thoughts.

For example, we would probably need some config file that says: if you are 
using {{hdfs}} to talk to {{namenode1}} you must use {{conf A}}, while if you 
are writing to {{namenode2}} you should use {{conf B}}. I'm not sure how to 
express this. Copy-pasting the whole fat jar into two different plugin 
directories, with two different configs, is one option, but...

I don't think changes in configuration, like a different {{hdfs-site.xml}}, 
should force the creation of another fat jar, for the same reason as:
{quote}
They share the same schema "hdfs" and it will be not convenient and confusing 
for users if we changes the schema. 
{quote}
I agree that both sinks, one writing to {{namenode1}} with {{conf A}} and one 
to {{namenode2}} with {{conf B}}, should use the same schema, but they should 
also use the same plugin.

I have to think a bit about this. Maybe we should decouple the concept of a 
plugin from the concept of a filesystem - one plugin could be used by 
different file system instances.
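The per-cluster mapping described above could look something like this. This is purely hypothetical: no such file exists in Flink, and the authorities and conf paths are made up for illustration.

```properties
# Hypothetical mapping: which Hadoop conf directory to use per HDFS authority.
# Neither this file nor these keys exist in Flink; sketch only.
hdfs://namenode1:8020 = /etc/hadoop/conf-a
hdfs://namenode2:8020 = /etc/hadoop/conf-b
```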



[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2020-01-01 Thread ouyangwulin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006568#comment-17006568
 ] 

ouyangwulin commented on FLINK-15378:
-

[~fly_in_gis] Also, I need different kerberos auth between the a hdfs  and b 
hdfs.



[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2020-01-01 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006562#comment-17006562
 ] 

Yang Wang commented on FLINK-15378:
---

Hi [~pnowojski], you are right, the configs are bundled inside the fat plugin jar.

[~ouyangwuli] means to use the same hdfs plugin with different configs. They 
share the same schema "hdfs", and it would be inconvenient and confusing for 
users if we changed the schema. So he wants to use {{schema://host:port}} as 
the identifier of different plugins. The A hdfs plugin could be 
{{hdfs://namenode1:port}}, and the B hdfs plugin could be {{hdfs://namenode2:port}}.

 

If you have any other suggestions, please point them out. Let's find the best 
way to solve this problem.



[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2020-01-01 Thread ouyangwulin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006557#comment-17006557
 ] 

ouyangwulin commented on FLINK-15378:
-

{code:java}
 do I understand your problem correctly, that you are trying to use the same 
plugin, but with different configs?{code}
   Yes, but how can the same plugin use different configs when it only creates 
one FileSystemFactory?
{code:java}
Can not you create a separate plugin but just with a different schema, instead 
of adding different identity? {code}
   I want to sink to different HDFS clusters. The scheme is the same across 
the different clusters, so I want to add an identifier to distinguish them.
{code:java}
where are the "conf A", "conf B"  and hdfs-site.xml files located? Are they 
bundled inside the plugin's fat jar? {code}
Yes, they are bundled inside the plugin's fat jar.



[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2019-12-31 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006060#comment-17006060
 ] 

Piotr Nowojski commented on FLINK-15378:


[~ouyangwuli] do I understand your problem correctly, that you are trying to 
use the same plugin, but with different configs? Could you not create a 
separate plugin, just with a different schema, instead of adding a different 
{{identity}}?

 

[~ouyangwuli] [~fly_in_gis] where are the "conf A", "conf B"  and 
{{hdfs-site.xml}} files located? Are they bundled inside the plugin's fat jar? 



[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2019-12-27 Thread ouyangwulin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004032#comment-17004032
 ] 

ouyangwulin commented on FLINK-15378:
-

[~fly_in_gis] https://issues.apache.org/jira/browse/FLINK-15355 fixes the 
plugin problem.



[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2019-12-25 Thread ouyangwulin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17003237#comment-17003237
 ] 

ouyangwulin commented on FLINK-15378:
-

[~fly_in_gis]

 >> For request 1, 

    Changing the plugin mechanism so that it does not use 
'{{classloader.parent-first-patterns.default}}' is a good idea.

>> For request 2,

    An aggregated hdfs-site.xml can support multiple HDFS clusters under the 
same Kerberos realm; it creates one filesystem instance that can write to the 
different HDFS clusters. But in our scenario we have one HDFS cluster with 
Kerberos, and the other cluster does not need Kerberos. It also should not 
require a plugin in `$FLINK_HOME/plugins`; the user jar can contain the 
FileSystem plugin.



[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2019-12-25 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17003176#comment-17003176
 ] 

Yang Wang commented on FLINK-15378:
---

Thanks for creating this ticket.

>> For request 1

AFAIK, we cannot support using the Hadoop FileSystem as a plugin, because 
flink-shaded-hadoop is put in the lib directory and added to the framework 
classpath, and Flink always loads "org.apache.hadoop" classes from the parent 
classloader. You can check the 
{{classloader.parent-first-patterns.default}} config option.

>> For request 2

So if you want to use multiple HDFS clusters in one Flink cluster, the plugin 
mechanism may not be a good choice. You could instead ship an aggregated 
hdfs-site.xml that covers the multiple HDFS clusters.
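An aggregated hdfs-site.xml of the kind suggested here would typically rely on HDFS nameservices; a minimal sketch, in which the nameservice and host names are illustrative:

```xml
<!-- Two non-HA clusters addressed through logical nameservices, so paths
     like hdfs://clusterA/... and hdfs://clusterB/... resolve from one conf. -->
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>clusterA,clusterB</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.clusterA</name>
    <value>namenode1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.clusterB</name>
    <value>namenode2:8020</value>
  </property>
</configuration>
```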



[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2019-12-24 Thread ouyangwulin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002719#comment-17002719
 ] 

ouyangwulin commented on FLINK-15378:
-

[~pnowojski] When I read the code in ClusterEntrypoint#startCluster, it loads 
plugins into FileSystem#FS_FACTORIES at cluster start, so 
'initializeWithoutPlugins' does not take effect. I think it conflicts with 
'FileSystem#getUnguardedFileSystem'.

[~wangy] Do you think using the scheme and authority as the key for 
'FileSystem#FS_FACTORIES' is suitable?



[jira] [Commented] (FLINK-15378) StreamFileSystemSink supported mutil hdfs plugins.

2019-12-23 Thread ouyangwulin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002704#comment-17002704
 ] 

ouyangwulin commented on FLINK-15378:
-

Please assign this issue to me!
