Hi Gokul,

As I understand it, dfs.data.dir specifies where the data is stored on the local file system, correct? If we intend it to mean something else here, I think the parameter name should change, since the HDFS configuration reference describes it as follows, and reusing the name would not be very intuitive:

"dfs.data.dir in the file /etc/hadoop/conf/hdfs-site.xml determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored."

What I am wondering is whether we need to tell the data source which point in the name node's file structure it should work from. This occurred to me while thinking about file permissions. Is this a valid scenario? The reference [1] seems to have the attribute I am talking about.

For example, given the file structure below, I only want my data source to be able to access the remote HDFS at testDir1, and not at / or at testDir2.

-/
 |_testDir1
 |_testDir2

[1] http://documentation.platfora.com/webdocs/index.html#data_source/add_hdfs_datasource.html

On Thu, May 21, 2015 at 12:29 PM, Gokul Balakrishnan <[email protected]> wrote:

> Hi Shani,
>
> Yes, we can specify the location using the dfs.data.dir property. In the
> implementation, we will be creating a FileSystem object in the datasource
> reader component and passing it for use by other components. During
> creation of this FileSystem object, we will use the Apache Commons
> Configuration object adorned with all properties specified for the
> datasource configuration. Therefore, if we set this property, we can
> restrict the consumer to whichever directory in the Hadoop filesystem is
> specified for it.
>
> Thanks,
>
> On 20 May 2015 at 22:16, Shani Ranasinghe <[email protected]> wrote:
>
>> Hi Gokul,
>>
>> When allowing a user to connect to a remote HDFS namenode, are we
>> assuming that it will always access the root ("/")? Or do we have a way
>> to specify which folder the data source should point to?
>>
>> On Thu, May 21, 2015 at 12:05 PM, Gokul Balakrishnan <[email protected]>
>> wrote:
>>
>>> Hi Srinath/Anjana,
>>>
>>> Sorry for the delay.
>>> Yes, we're changing from Hadoop as part of the
>>> improvements, where we will have the same underlying implementation for the
>>> config but with two DatasourceReader implementations, one each for HBase
>>> and HDFS respectively.
>>>
>>> We are also modifying the format of the expected configuration to better
>>> match the properties specified in other Carbon datasource configurations,
>>> so the updated config would look something like:
>>>
>>> <datasource>
>>>     <name>WSO2_ANALYTICS_FS_DB_HDFS</name>
>>>     <description>The datasource used for analytics file system</description>
>>>     <jndiConfig>
>>>         <name>jdbc/WSO2HDFSDB</name>
>>>     </jndiConfig>
>>>     <definition type="HDFS">
>>>         <configuration>
>>>             <property name="fs.default.name" value="hdfs://localhost:9000" />
>>>             <property name="dfs.data.dir" value="/dfs/data" />
>>>             <property name="fs.hdfs.impl" value="org.apache.hadoop.hdfs.DistributedFileSystem" />
>>>             <property name="fs.file.impl" value="org.apache.hadoop.fs.LocalFileSystem" />
>>>         </configuration>
>>>     </definition>
>>> </datasource>
>>>
>>> Thanks,
>>>
>>> On 14 May 2015 at 00:52, Anjana Fernando <[email protected]> wrote:
>>>
>>>> Hi Srinath,
>>>>
>>>> Yeah, I'd a chat with Gokul yesterday, we are changing this to HDFS and
>>>> also having another HBase one as well, I think he has already done the
>>>> changes. @Gokul, please send the updated information.
>>>>
>>>> Cheers,
>>>> Anjana.
>>>>
>>>> On Thu, May 14, 2015 at 1:10 PM, Srinath Perera <[email protected]>
>>>> wrote:
>>>>
>>>>> Can we call type HDFS instead of Hadoop? (if we can change that
>>>>> without much trouble)
>>>>>
>>>>> On Tue, May 12, 2015 at 8:38 PM, Gokul Balakrishnan <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> As part of the HBase analytics datasource implementation for DAS 3.0,
>>>>>> we have come up with $subject which is envisioned to offer a standardised
>>>>>> way to specify connectivity parameters for a remote Hadoop-based instance
>>>>>> in a Carbon datasource configuration.
>>>>>>
>>>>>> The datasource reader will expect the configuration to be specified
>>>>>> in a similar format which is used for standard Apache Commons
>>>>>> Configuration [1], as used by both HDFS and HBase. An example
>>>>>> datasource definition would look like:
>>>>>>
>>>>>> <datasource>
>>>>>>     <name>WSO2_ANALYTICS_FS_DB_HDFS</name>
>>>>>>     <description>The datasource used for analytics file system</description>
>>>>>>     <jndiConfig>
>>>>>>         <name>jdbc/WSO2HDFSDB</name>
>>>>>>     </jndiConfig>
>>>>>>     <definition type="HADOOP">
>>>>>>         <configuration>
>>>>>>             <property>
>>>>>>                 <name>fs.default.name</name>
>>>>>>                 <value>hdfs://localhost:9000</value>
>>>>>>             </property>
>>>>>>             <property>
>>>>>>                 <name>dfs.data.dir</name>
>>>>>>                 <value>/dfs/data</value>
>>>>>>             </property>
>>>>>>             <property>
>>>>>>                 <name>fs.hdfs.impl</name>
>>>>>>                 <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
>>>>>>             </property>
>>>>>>             <property>
>>>>>>                 <name>fs.file.impl</name>
>>>>>>                 <value>org.apache.hadoop.fs.LocalFileSystem</value>
>>>>>>             </property>
>>>>>>         </configuration>
>>>>>>     </definition>
>>>>>> </datasource>
>>>>>>
>>>>>> The definition type for the above is set as "HADOOP". The datasource
>>>>>> reader implementation is currently hosted at [2], and would be merged
>>>>>> with the carbon-data git repo once reviewed.
>>>>>>
>>>>>> Appreciate your thought and suggestions.
>>>>>>
>>>>>> Thanks,
>>>>>> Gokul.
>>>>>>
>>>>>> [1] http://commons.apache.org/proper/commons-configuration/
>>>>>>
>>>>>> [2] https://github.com/gokulbs/carbon-data/tree/master/components/data-sources/org.wso2.carbon.datasource.reader.hadoop
>>>>>>
>>>>>> --
>>>>>> Balakrishnan Gokulakrishnan
>>>>>> Senior Software Engineer,
>>>>>> WSO2, Inc. http://wso2.com
>>>>>> Mob: +94 77 593 5789 | +1 650 272 9927
>>>>>>
>>>>>> _______________________________________________
>>>>>> Architecture mailing list
>>>>>> [email protected]
>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>
>>>>> --
>>>>> ============================
>>>>> Srinath Perera, Ph.D.
>>>>> http://people.apache.org/~hemapani/
>>>>> http://srinathsview.blogspot.com/
>>>>
>>>> --
>>>> *Anjana Fernando*
>>>> Senior Technical Lead
>>>> WSO2 Inc. | http://wso2.com
>>>> lean . enterprise . middleware
>>
>> --
>> Thanks and Regards,
>> *Shani Ranasinghe*
>> Senior Software Engineer
>> WSO2 Inc.; http://wso2.com
>> lean.enterprise.middleware
>>
>> mobile: +94 77 2273555
>> Blog: http://waysandmeans.blogspot.com/
>> linked in: lk.linkedin.com/pub/shani-ranasinghe/34/111/ab
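
For illustration, the attribute-style configuration proposed in the thread (<property name="..." value="..." /> elements under <configuration>) can be read into a plain key/value map before it is handed to whatever builds the Hadoop-side configuration. The following is only a minimal sketch using the JDK's DOM parser. The class name DatasourceConfigSketch and the method parseProperties are hypothetical; the thread does not show how the actual datasource reader parses its input, so nothing here should be read as that implementation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of carbon-data: collects every
// <property name="..." value="..."/> element into an ordered map.
final class DatasourceConfigSketch {

    static Map<String, String> parseProperties(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // getElementsByTagName returns matches in document order.
            NodeList nodes = doc.getElementsByTagName("property");
            Map<String, String> props = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                props.put(property.getAttribute("name"),
                          property.getAttribute("value"));
            }
            return props;
        } catch (Exception e) {
            // Wrap parser failures so callers need no checked exceptions.
            throw new IllegalArgumentException("Invalid datasource XML", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed copy of the updated config from the thread.
        String xml =
                "<datasource>"
              + "<name>WSO2_ANALYTICS_FS_DB_HDFS</name>"
              + "<definition type=\"HDFS\"><configuration>"
              + "<property name=\"fs.default.name\" value=\"hdfs://localhost:9000\" />"
              + "<property name=\"dfs.data.dir\" value=\"/dfs/data\" />"
              + "</configuration></definition>"
              + "</datasource>";

        parseProperties(xml).forEach(
                (name, value) -> System.out.println(name + " = " + value));
    }
}
```

Each entry in the returned map could then be copied onto the configuration object used to create the FileSystem, which is the step described in the quoted reply.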

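
On the question of limiting a data source to testDir1: one way to picture such a restriction is a check that normalises each requested path and confirms it stays under a configured base directory. The sketch below assumes absolute, slash-separated HDFS-style paths. The class BasePathGuardSketch, the method isWithin, and the idea of a separate base-directory setting are illustrative assumptions only; they are not existing Hadoop or Carbon properties, and a check like this would complement HDFS file permissions rather than replace them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical guard: decides whether a requested path lies inside a base directory.
final class BasePathGuardSketch {

    // Splits an absolute path into segments, resolving "." and "..".
    static List<String> normalize(String path) {
        Deque<String> parts = new ArrayDeque<>();
        for (String segment : path.split("/")) {
            if (segment.isEmpty() || segment.equals(".")) {
                continue;
            }
            if (segment.equals("..")) {
                // ".." at the root stays at the root.
                if (!parts.isEmpty()) {
                    parts.removeLast();
                }
                continue;
            }
            parts.addLast(segment);
        }
        return new ArrayList<>(parts);
    }

    // True when 'requested' is the base directory itself or a descendant of it.
    // Comparison is per segment, so /testDir10 is not treated as inside /testDir1.
    static boolean isWithin(String baseDir, String requested) {
        List<String> base = normalize(baseDir);
        List<String> target = normalize(requested);
        return target.size() >= base.size()
                && target.subList(0, base.size()).equals(base);
    }

    public static void main(String[] args) {
        String base = "/testDir1";
        String[] requests = {
            "/testDir1/data/part-0",
            "/testDir2/data",
            "/testDir1/../testDir2",
            "/"
        };
        for (String request : requests) {
            System.out.println(request + " allowed: " + isWithin(base, request));
        }
    }
}
```

Comparing whole segments rather than raw string prefixes is the main design point here: it avoids accepting sibling directories that merely share a name prefix, and resolving ".." first keeps a request from stepping out of the base directory.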