Ivan, I think this should be documented, no?

On Mon, Dec 14, 2015 at 2:25 AM, Ivan V. <iveselovs...@gridgain.com> wrote:
> To enable just IGFS persistence there is no need to use HDFS (that
> requires a Hadoop dependency, a configured HDFS cluster, etc.). We have
> requests https://issues.apache.org/jira/browse/IGNITE-1120 and
> https://issues.apache.org/jira/browse/IGNITE-1926 to implement
> persistence on top of the local file system, and we are already close to
> a solution.
>
> Regarding the secondary FS doc page
> (http://apacheignite.gridgain.org/docs/secondary-file-system), I would
> suggest adding the following text there:
> ------------------------
> If an Ignite node with a secondary file system is configured on a machine
> with a Hadoop distribution, make sure Ignite is able to find the
> appropriate Hadoop libraries: set the HADOOP_HOME environment variable
> for the Ignite process if you're using the Apache Hadoop distribution,
> or, if you use another distribution (HDP, Cloudera, BigTop, etc.), make
> sure the /etc/default/hadoop file exists and has appropriate contents.
>
> If an Ignite node with a secondary file system is configured on a machine
> without a Hadoop distribution, you can manually add the necessary Hadoop
> dependencies to the Ignite node classpath: these are the dependencies of
> groupId "org.apache.hadoop" listed in the file modules/hadoop/pom.xml.
> Currently they are:
>
> 1. hadoop-annotations
> 2. hadoop-auth
> 3. hadoop-common
> 4. hadoop-hdfs
> 5. hadoop-mapreduce-client-common
> 6. hadoop-mapreduce-client-core
>
> ------------------------
>
> On Mon, Dec 14, 2015 at 11:21 AM, Valentin Kulichenko <
> valentin.kuliche...@gmail.com> wrote:
>
> > Guys,
> >
> > Why don't we include the ignite-hadoop module in Fabric? This user
> > simply wants to configure HDFS as a secondary file system to ensure
> > persistence. Not having the opportunity to do this in Fabric looks
> > weird to me. And actually I don't think this is a use case for Hadoop
> > Accelerator.
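The manual-classpath option Ivan describes above can be sketched in shell. This is only an illustration: the jar directory /opt/hadoop-libs and version 2.7.1 are made-up placeholders, not values from the thread.

```shell
# Sketch only: collect the six org.apache.hadoop dependencies listed above
# into a classpath fragment for an Ignite node without a local Hadoop
# distribution. HADOOP_LIBS and HADOOP_VERSION are placeholder values.
HADOOP_LIBS=/opt/hadoop-libs
HADOOP_VERSION=2.7.1

CP=""
for jar in hadoop-annotations hadoop-auth hadoop-common hadoop-hdfs \
           hadoop-mapreduce-client-common hadoop-mapreduce-client-core; do
  CP="$CP:$HADOOP_LIBS/$jar-$HADOOP_VERSION.jar"
done
CP=${CP#:}   # drop the leading ':'

echo "$CP"
```

In practice, simply copying these jars into IGNITE_HOME/libs should also work, since the Ignite startup scripts pick up everything in that directory.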
> >
> > -Val
> >
> > On Mon, Dec 14, 2015 at 12:11 AM, Denis Magda <dma...@gridgain.com>
> > wrote:
> >
> > > Hi Ivan,
> > >
> > > 1) Yes, I think it makes sense to keep the old versions of the docs
> > > while an old version is still considered to be in use by someone.
> > >
> > > 2) Absolutely, the time to add a corresponding article on readme.io
> > > has come. It's not the first time I've seen a question related to
> > > HDFS as a secondary FS. Before, and still now, it's not clear to me
> > > what exact steps I should follow to enable such a configuration. Our
> > > current suggestions look like a puzzle. I'll assemble the puzzle on
> > > my side and prepare the article. Ivan, if you don't mind, I'll reach
> > > out to you directly for technical assistance if needed.
> > >
> > > Regards,
> > > Denis
> > >
> > >
> > > On 12/14/2015 10:25 AM, Ivan V. wrote:
> > >
> > >> Hi, Valentin,
> > >>
> > >> 1) First of all, note that the author of the question is not using
> > >> the latest doc page, namely
> > >> http://apacheignite.gridgain.org/v1.0/docs/igfs-secondary-file-system.
> > >> This is version 1.0, while the latest is 1.5:
> > >> https://apacheignite.readme.io/docs/hadoop-accelerator. Besides, it
> > >> turned out that some links from the latest doc version point to the
> > >> 1.0 doc version. I fixed that in the several places where I found
> > >> it. Do we really need the old doc versions (1.0-1.4)?
> > >>
> > >> 2) Our documentation
> > >> (http://apacheignite.gridgain.org/docs/secondary-file-system) does
> > >> not provide any special setup instructions to configure HDFS as a
> > >> secondary file system in Ignite. Our docs assume that if a user
> > >> wants to integrate with Hadoop, (s)he follows the generic Hadoop
> > >> integration instructions (e.g.
> > >> http://apacheignite.gridgain.org/docs/installing-on-apache-hadoop).
> > >> It looks like the page
> > >> http://apacheignite.gridgain.org/docs/secondary-file-system should
> > >> be clearer about the required configuration steps (in fact, setting
> > >> the HADOOP_HOME variable for the Ignite node process).
> > >>
> > >> 3) Hadoop jars are correctly found by Ignite if the following
> > >> conditions are met:
> > >> (a) The "Hadoop Edition" distribution is used (not the "Fabric"
> > >> edition).
> > >> (b) Either the HADOOP_HOME environment variable is set (for the
> > >> Apache Hadoop distribution), or the file /etc/default/hadoop exists
> > >> and matches the Hadoop distribution used (BigTop, Cloudera, HDP,
> > >> etc.).
> > >>
> > >> The exact mechanism of the Hadoop classpath composition can be
> > >> found in the files
> > >> IGNITE_HOME/bin/include/hadoop-classpath.sh
> > >> IGNITE_HOME/bin/include/setenv.sh
> > >>
> > >> The issue is discussed in
> > >> https://issues.apache.org/jira/browse/IGNITE-372 and
> > >> https://issues.apache.org/jira/browse/IGNITE-483.
> > >>
> > >> On Sat, Dec 12, 2015 at 3:45 AM, Valentin Kulichenko <
> > >> valentin.kuliche...@gmail.com> wrote:
> > >>
> > >>> Igniters,
> > >>>
> > >>> I'm looking at the question on SO [1] and I'm a bit confused.
> > >>>
> > >>> We ship the ignite-hadoop module only in Hadoop Accelerator and
> > >>> without Hadoop JARs, assuming that the user will include them from
> > >>> the Hadoop distribution he uses. That seems OK to me when the
> > >>> accelerator is plugged into Hadoop to run MapReduce jobs, but I
> > >>> can't figure out the steps required to configure HDFS as a
> > >>> secondary FS for IGFS. Which Hadoop JARs should be on the
> > >>> classpath? Is the user supposed to add them manually?
> > >>>
> > >>> Can someone with more expertise in our Hadoop integration clarify
> > >>> this? I believe there is not enough documentation on this topic.
> > >>>
> > >>> BTW, any ideas why the user gets an exception for the JobConf
> > >>> class, which is in the 'mapred' package?
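As for the configuration itself, pointing IGFS at HDFS as a secondary file system is done in the Ignite Spring XML. A minimal sketch follows; the HDFS URI hdfs://hdfs-host:9000/ is a placeholder for your NameNode address, and depending on the Ignite version you may also need to declare the IGFS meta/data caches:

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
  <property name="fileSystemConfiguration">
    <list>
      <bean class="org.apache.ignite.configuration.FileSystemConfiguration">
        <property name="name" value="igfs"/>
        <!-- Reads/writes pass through to this HDFS cluster. -->
        <property name="secondaryFileSystem">
          <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
            <!-- Placeholder URI; point at your HDFS NameNode. -->
            <constructor-arg value="hdfs://hdfs-host:9000/"/>
          </bean>
        </property>
      </bean>
    </list>
  </property>
</bean>
```

With this in place, the Hadoop jars discussed in this thread must be on the Ignite node's classpath for the secondary file system class to load.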
> > >>> Why is a MapReduce class being used?
> > >>>
> > >>> [1]
> > >>> http://stackoverflow.com/questions/34221355/apache-ignite-what-are-the-dependencies-of-ignitehadoopigfssecondaryfilesystem
> > >>>
> > >>> -Val
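The lookup order Ivan describes in point (3) can be sketched as follows. This is a hypothetical simplification, not the actual contents of the scripts (the real logic lives in IGNITE_HOME/bin/include/hadoop-classpath.sh and setenv.sh):

```shell
# Sketch of the Hadoop-location lookup order described above:
# honor HADOOP_HOME if set (Apache Hadoop distribution); otherwise fall
# back to /etc/default/hadoop (HDP, Cloudera, BigTop, etc.).
resolve_hadoop_home() {
  if [ -n "$HADOOP_HOME" ]; then
    echo "$HADOOP_HOME"
  elif [ -f /etc/default/hadoop ]; then
    # Distribution-specific file expected to export HADOOP_HOME.
    . /etc/default/hadoop
    echo "$HADOOP_HOME"
  else
    echo "Hadoop installation not found; set HADOOP_HOME" >&2
    return 1
  fi
}
```

If neither condition holds, the node cannot compose a Hadoop classpath, which matches the symptoms in the SO question.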