The requirement is simple. We need to generate log files on a per-tenant, per-date, per-service basis. Now, as a big data and analytics expert, please advise us on the best solution for this.
Azeez

On Mon, Jul 23, 2012 at 6:05 PM, Tharindu Mathew <[email protected]> wrote:

> So through this custom Java task, what is the scale of log processing you
> will support? 100 MB, 1 GB, 100 GB, 1 TB?
>
> On Mon, Jul 23, 2012 at 5:14 PM, Manisha Gayathri <[email protected]> wrote:
>
>> Contacted the Hive User Group as well on this matter.
>> They also mentioned that this approach is not possible.
>> Also, as per the chat I had with Buddhika, right now this kind of
>> dynamic variable creation is not possible in the Hive that comes with BAM2.
>>
>> Therefore IMO, without going ahead with this cumbersome process, the best
>> way will be to run a scheduled Java task to pick data from the relevant
>> Cassandra column families and dynamically generate the relevant log files
>> (according to the tenantID and current date), which will be stored in Apache
>> Directory.
>>
> You are going to store the results in an LDAP?
>
>> As per the offline chat I had with Azeez, I will start to work on a custom
>> Java task that can handle the above scenario.
>>
>> On Mon, Jul 23, 2012 at 2:27 PM, Manisha Gayathri <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> For a log file storing scenario using BAM2, I have a requirement to
>>> generate separate log files for each date. For that I have created a Hive
>>> analytic query along with a Hive UDF.
>>>
>>> I have the getFilePath function, which should return a URL like this:
>>>
>>> home/user/Desktop/logDir/logs/log_0_testServer_2012_07_22
>>>
>>> The defined function works perfectly if I put
>>> getFilePath( "0","testServer" ) into the select statement.
>>>
>>> But I want to get that particular URL as the local directory name.
>>> (The requirement is such that this should not be hard-coded in the Hive
>>> query. Rather, it should be generated in the custom UDF.)
>>>
>>> So can I do something like I have shown below?
>>>
>>> set file_name= getFilePath( "0","testServer" );   //Define a parameter.
>>> .................
>>> ..............
>>> INSERT OVERWRITE LOCAL DIRECTORY 'file:///${hiveconf:file_name}'   //Assign the above parameter as the file URL
>>>
>>> I tried this way. But the directory name is returned as
>>>
>>> file:/getFilePath( "0" , "testServer" )
>>>
>>> Does that mean I cannot use a UDF to define the local directory name?
>>> Or am I doing anything wrong here?
>>>
>>> --
>>> ~Regards
>>> Manisha Eleperuma
>>> Software Engineer
>>> WSO2, Inc.: http://wso2.com
>>> lean.enterprise.middleware
>
> --
> Regards,
>
> Tharindu
>
> blog: http://mackiemathew.com/
> M: +94777759908

--
Afkham Azeez
Director of Architecture; WSO2, Inc.; http://wso2.com
Member; Apache Software Foundation; http://www.apache.org/
email: [email protected]
cell: +94 77 3320919
blog: http://blog.afkham.org
twitter: http://twitter.com/afkham_azeez
linked-in: http://lk.linkedin.com/in/afkhamazeez

Lean . Enterprise . Middleware
_______________________________________________ Dev mailing list [email protected] http://wso2.org/cgi-bin/mailman/listinfo/dev
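
For illustration only, here is a minimal sketch of how a standalone Java task might build the per-tenant, per-service, per-date file name quoted in the thread (log_0_testServer_2012_07_22). The class and method names (LogPathSketch, logFileName, logFilePath) are hypothetical, and the use of java.time assumes a Java 8 or later JDK; none of this is taken from BAM2 or from the getFilePath UDF discussed above, whose source is not shown in the thread.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: not part of BAM2 or Hive.
final class LogPathSketch {

    // Date suffix in the form seen in the thread, e.g. 2012_07_22.
    private static final DateTimeFormatter DATE_SUFFIX =
            DateTimeFormatter.ofPattern("yyyy_MM_dd");

    // Builds "log_<tenantId>_<service>_<yyyy_MM_dd>".
    static String logFileName(String tenantId, String service, LocalDate date) {
        return "log_" + tenantId + "_" + service + "_" + date.format(DATE_SUFFIX);
    }

    // Resolves the file name against a base log directory.
    static Path logFilePath(String baseDir, String tenantId, String service, LocalDate date) {
        return Paths.get(baseDir, logFileName(tenantId, service, date));
    }

    public static void main(String[] args) {
        // Base directory copied from the example path in the thread.
        Path path = logFilePath("home/user/Desktop/logDir/logs",
                "0", "testServer", LocalDate.of(2012, 7, 22));
        System.out.println(path);
    }
}
```

Because the name is computed in plain Java before any query runs, the task does not depend on Hive evaluating a UDF inside a `set` statement, which the thread reports yields the literal text getFilePath( "0" , "testServer" ) rather than its result.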
