[
https://issues.apache.org/jira/browse/FLINK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006692#comment-17006692
]
Piotr Nowojski commented on FLINK-15378:
----------------------------------------
Thanks for the explanations, I think I now understand the issue.
For one thing, the current approach in the [proposed
PR|https://github.com/apache/flink/pull/10686/] is not generic enough. It
limits the support for different configurations to just {{StreamingFileSink}}.
If we allowed plugins to be identified by parts of the URI (for example
{{host}} or {{port}}, as suggested by [~fly_in_gis]), that would be better.
However, I see a couple of issues/follow-up thoughts.
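To make the URI-based idea concrete, here is a minimal sketch of a lookup keyed by scheme plus authority (host and port), falling back to the scheme alone. {{FactoryRegistry}} and its methods are hypothetical names for illustration only, not Flink's actual {{FileSystem}} internals.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical registry for illustration; not Flink's actual FileSystem code. */
final class FactoryRegistry<F> {
    private final Map<String, F> factories = new HashMap<>();

    /** Registers a factory for a scheme; a null authority means "any authority". */
    void register(String scheme, String authority, F factory) {
        factories.put(key(scheme, authority), factory);
    }

    /** Tries scheme + authority first, then falls back to the scheme-only entry. */
    Optional<F> lookup(URI uri) {
        F exact = factories.get(key(uri.getScheme(), uri.getAuthority()));
        if (exact != null) {
            return Optional.of(exact);
        }
        return Optional.ofNullable(factories.get(key(uri.getScheme(), null)));
    }

    private static String key(String scheme, String authority) {
        return authority == null ? scheme : scheme + "://" + authority;
    }
}
```

With this shape, {{hdfs://namenode2:8020/...}} could resolve to a dedicated entry while every other {{hdfs}} URI keeps using the scheme-wide default.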
For example, we would probably need some config file that would say that if
you are using {{hdfs}} to talk to {{namenode1}} you must use {{conf A}}, while
if you are writing to {{namenode2}} you should use {{conf B}}. I'm not sure how
to express this. Just copy-pasting the whole fat jar into two different plugin
directories, with two different configs, is one option, but...
I don't think changes in configuration, like a different {{hdfs-site.xml}},
should force the creation of another fat jar, for the same reason as:
{quote}
They share the same schema "hdfs" and it will be not convenient and confusing
for users if we changes the schema.
{quote}
I agree that both sinks, one writing to {{namenode1}} with {{conf A}} and one
to {{namenode2}} with {{conf B}}, should be using the same schema, but they
should also be using the same plugin.
I have to think a bit about this. Maybe we should decouple the concept of a
plugin from the concept of a filesystem: one plugin could be used by different
file system instances.
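As a rough illustration of that decoupling, the sketch below loads one plugin once and lets it create separate file system instances per authority, each with its own configuration directory. All names ({{FsPlugin}}, {{FsInstance}}, {{FsInstances}}) and the conf paths in the usage note are made up for this example; how such a mapping would actually be configured is still open.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical: stands in for the code loaded once from a single plugin jar. */
interface FsPlugin {
    FsInstance create(URI uri, String confDir);
}

/** Hypothetical: one configured file system, bound to an authority and a conf dir. */
final class FsInstance {
    final String authority;
    final String confDir;

    FsInstance(String authority, String confDir) {
        this.authority = authority;
        this.confDir = confDir;
    }
}

/** One plugin, many instances: the conf dir is chosen by the URI's authority. */
final class FsInstances {
    private final FsPlugin plugin;
    private final Map<String, String> confDirByAuthority;
    private final String defaultConfDir;
    private final Map<String, FsInstance> cache = new HashMap<>();

    FsInstances(FsPlugin plugin, Map<String, String> confDirByAuthority, String defaultConfDir) {
        this.plugin = plugin;
        this.confDirByAuthority = new HashMap<>(confDirByAuthority);
        this.defaultConfDir = defaultConfDir;
    }

    /** Returns the cached instance for this authority, creating it on first use. */
    FsInstance get(URI uri) {
        // URIs without an authority (e.g. "hdfs:///path") share the "" cache slot.
        String authority = Objects.toString(uri.getAuthority(), "");
        return cache.computeIfAbsent(
                authority,
                a -> plugin.create(uri, confDirByAuthority.getOrDefault(a, defaultConfDir)));
    }
}
```

Here a mapping such as {{namenode1:8020 -> /etc/hadoop/conf-a}} and {{namenode2:8020 -> /etc/hadoop/conf-b}} (both paths assumed) would give two sinks the same plugin and the same {{hdfs}} schema, without duplicating the fat jar.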
> StreamFileSystemSink supported mutil hdfs plugins.
> --------------------------------------------------
>
> Key: FLINK-15378
> URL: https://issues.apache.org/jira/browse/FLINK-15378
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / FileSystem, FileSystems
> Affects Versions: 1.9.2, 1.10.0
> Reporter: ouyangwulin
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.11.0
>
> Attachments: jobmananger.log
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> [As reported on the mailing
> list|https://lists.apache.org/thread.html/7a6b1e341bde0ef632a82f8d46c9c93da358244b6bac0d8d544d11cb%40%3Cuser.flink.apache.org%3E]
> Request 1: FileSystem plugins should not affect the default YARN dependencies.
> Request 2: StreamFileSystemSink should support multiple hdfs plugins under
> the same schema.
> Problem description:
> When I put a filesystem plugin into FLINK_HOME/plugins, and the class
> '*com.filesystem.plugin.FileSystemFactoryEnhance*' implements
> '*FileSystemFactory*', then when the JM starts it calls
> FileSystem.initialize(configuration,
> PluginUtils.createPluginManagerFromRootFolder(configuration)) to load
> factories into the map FileSystem#FS_FACTORIES, and the key is only the
> schema. The TM/JM use the local Hadoop conf A, while the user code uses
> Hadoop conf B in the filesystem plugin; conf A and conf B point to different
> Hadoop clusters. The JM then fails to start, because the blob server in the
> JM loads conf B to get the filesystem. The full log is attached.
>
> Proposed resolution:
> Use the schema plus a specific identifier as the key for
> 'FileSystem#FS_FACTORIES'.
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)