I like it.

On Mon, Mar 2, 2015 at 5:12 AM, Vladimir Ozerov <[email protected]> wrote:
> Hi all,
>
> We spent some time discussing the file system and Hadoop APIs. There were
> two possible ways to improve the current non-obvious API.
> The first idea was to leave the API more or less the same, with only some
> cosmetic changes, mainly class names.
> The second idea was to remove all secondary file system configuration
> parameters from IgfsConfiguration and move them to the Hadoop module. IGFS
> would then be wired up with a Hadoop secondary file system through a
> private interface which is not exposed to users.
>
> I think the first solution is better, because the secondary file system in
> IGFS is currently a kind of extension point. A user is free to implement
> their own secondary storage and use it in much the same way a store is
> used in a cache. I do not see any sensible reason to remove this extension
> point and hide it in the Hadoop module. Therefore, I designed the new API
> using the first approach, and the draft is in the branch ignite-386.
> Please feel free to review and comment on it.
>
> I'll also briefly go through the new design here:
>
> Core module:
> 1) o.a.i.IgniteFileSystem - user interface for working with our native
> file system. Obtained via the Ignite.fileSystem() method.
> Based on the "IgniteFs" and "Igfs" interfaces in the current
> implementation.
>
> 2) o.a.i.filesystem.SecondaryFileSystem - API for creating secondary file
> systems for IGFS.
> Based on the "Igfs" interface in the current implementation.
>
> Note that there is no longer a direct link between IgniteFileSystem and
> SecondaryFileSystem, as these are completely different entities.
>
> 3) o.a.i.configuration.FileSystemConfiguration - configuration bean for
> IgniteFileSystem. It has the setter
> "setSecondaryFileSystem(SecondaryFileSystem)".
>
> Hadoop module:
> 1) There are 4 map-reduce classes under the o.a.i.hadoop.mapreduce
> package. Their packages mirror the corresponding packages in the Hadoop
> API.
> E.g.:
> org.apache.ignite.[hadoop.mapreduce.protocol.IgniteHadoopClientProtocol]
> implements org.apache.[hadoop.mapreduce.protocol.ClientProtocol].
>
> 2) Two file system implementations named "IgniteHadoopFileSystem", for v1
> and v2 Hadoop APIs.
>
> 3) IgniteHadoopSecondaryFileSystem - implementation of SecondaryFileSystem
> from the core module, capable of delegating native IGFS calls to an
> underlying Hadoop FileSystem.
> It is named "IgfsHadoopFileSystemWrapper" in the current implementation.
>
> Let me give you an example of how a user will configure it now.
>
> 1) Ignite configuration:
> <bean class="org.apache.ignite.configuration.IgniteConfiguration">
>   <property name="fileSystemConfiguration">
>     <list>
>       <bean class="org.apache.ignite.configuration.FileSystemConfiguration">
>         <!-- Delegate to real HDFS. -->
>         <property name="secondaryFileSystem">
>           <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopSecondaryFileSystem">
>             <constructor-arg value="hdfs://192.168.1.23"/>
>           </bean>
>         </property>
>       </bean>
>     </list>
>   </property>
> </bean>
>
> 2) core-site.xml:
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>igfs:///</value>
>   </property>
>   <property>
>     <name>fs.igfs.impl</name>
>     <value>org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem</value>
>   </property>
>   <property>
>     <name>fs.AbstractFileSystem.igfs.impl</name>
>     <value>org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem</value>
>   </property>
> </configuration>
>
> Seems pretty clear and consistent to me.
>
> Thoughts?
>
> Vladimir.
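
To make the extension-point argument concrete: a user-defined secondary storage would plug in roughly like the sketch below. The interface shape here is a deliberately simplified assumption — the real SecondaryFileSystem draft in ignite-386 has a much richer API — and the in-memory store just stands in for an arbitrary backing system.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the proposed SecondaryFileSystem extension point.
// These three operations are just enough to show the plug-in pattern; they
// are NOT the actual draft signatures.
interface SecondaryFileSystem {
    boolean exists(String path);
    void create(String path, byte[] data);
    byte[] read(String path);
}

// Example user-provided secondary storage: an in-memory map standing in
// for a real backing store such as HDFS.
class InMemorySecondaryFileSystem implements SecondaryFileSystem {
    private final Map<String, byte[]> files = new HashMap<>();

    @Override public boolean exists(String path) {
        return files.containsKey(path);
    }

    @Override public void create(String path, byte[] data) {
        files.put(path, data);
    }

    @Override public byte[] read(String path) {
        return files.get(path);
    }
}

public class SecondaryFsDemo {
    public static void main(String[] args) {
        // In the proposed design, IGFS would receive this instance via
        // FileSystemConfiguration.setSecondaryFileSystem(...).
        SecondaryFileSystem fs = new InMemorySecondaryFileSystem();
        fs.create("/tmp/a.txt", "hello".getBytes());
        System.out.println(fs.exists("/tmp/a.txt"));
        System.out.println(new String(fs.read("/tmp/a.txt")));
    }
}
```

This mirrors how a cache store is wired into a cache: the core module defines the contract, and any implementation — Ignite's own Hadoop wrapper or a user's custom class — can be set on the configuration bean.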

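The IgniteHadoopSecondaryFileSystem described in item 3 is essentially a thin delegating wrapper. Below is a hedged sketch of that delegation pattern with simplified stand-in interfaces; the names and signatures are illustrative assumptions, not the actual Ignite or Hadoop APIs.

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for org.apache.hadoop.fs.FileSystem (assumed, simplified shape).
interface HadoopFs {
    boolean mkdirs(String path);
    boolean exists(String path);
}

// Stand-in for the proposed core SecondaryFileSystem interface.
interface SecondaryFs {
    boolean mkdirs(String path);
    boolean exists(String path);
}

// Sketch of the IgniteHadoopSecondaryFileSystem idea: every native IGFS
// call is forwarded to the underlying Hadoop file system.
class HadoopDelegatingSecondaryFs implements SecondaryFs {
    private final HadoopFs delegate;

    HadoopDelegatingSecondaryFs(HadoopFs delegate) {
        this.delegate = delegate;
    }

    @Override public boolean mkdirs(String path) {
        return delegate.mkdirs(path);
    }

    @Override public boolean exists(String path) {
        return delegate.exists(path);
    }
}

public class DelegationDemo {
    public static void main(String[] args) {
        // Fake "HDFS" backed by a set of known directories.
        Set<String> dirs = new HashSet<>();
        HadoopFs hdfs = new HadoopFs() {
            @Override public boolean mkdirs(String path) { return dirs.add(path); }
            @Override public boolean exists(String path) { return dirs.contains(path); }
        };

        SecondaryFs secondary = new HadoopDelegatingSecondaryFs(hdfs);
        secondary.mkdirs("/user/ignite");
        System.out.println(secondary.exists("/user/ignite"));
    }
}
```

Because the wrapper lives in the Hadoop module while the interface lives in core, the core module keeps no Hadoop dependency — which is the point of the split described above.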