I'm no expert, but I immediately question how well this approach will scale. Do you have an idea of how much log data you plan to process per task?
On Mon, Jul 23, 2012 at 6:13 PM, Afkham Azeez <[email protected]> wrote:

> The requirement is simple. We need to generate log files on a per tenant,
> per date, per Service basis. Now as a big data & analytics expert, please
> advise us on what is the best solution for this.
>
> Azeez
>
>
> On Mon, Jul 23, 2012 at 6:05 PM, Tharindu Mathew <[email protected]> wrote:
>
>> So through this custom java task, what is the scale of log processing you
>> will support? 100MB, 1 GB, 100 GB, 1 TB?
>>
>> On Mon, Jul 23, 2012 at 5:14 PM, Manisha Gayathri <[email protected]> wrote:
>>
>>> Contacted Hive User Group as well on this matter.
>>> They also mentioned that this approach is not possible.
>>> Also as per the chat I had with Buddhika, right now, these kind of
>>> dynamic variable creations is not possible in Hive that comes with BAM2.
>>>
>>> Therefore IMO, without going ahead with this cumbersome process, the
>>> best way will be to run a scheduled java task to pick data from relevant
>>> Cassandra Column families and dynamically generate the relevant log files
>>> (according to the tenantID and current date) which will be stored in Apache
>>> Directory.
>>>
>> You are going to store the results in a LDAP?
>>
>>>
>>> As per the offline chat had with Azeez, will start to work on a custom
>>> Java task that can handle the above scenario.
>>>
>>> On Mon, Jul 23, 2012 at 2:27 PM, Manisha Gayathri <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> For a log file storing scenario using BAM2, I have a requirement to
>>>> generate separate log files for each date. For that I have created a Hive
>>>> Analytic query along with a Hive UDF as well.
>>>>
>>>> I have the getFilePath function which should return a URL like this.
>>>>
>>>> home/user/Desktop/logDir/logs/log_0_testServer_2012_07_22
>>>>
>>>> The defined function works perfectly if I put *getFilePath(
>>>> "0","testServer" )* into the *select* statement.
>>>>
>>>> But I want to get that particular URL as the *local directory name*.
>>>> (The requirement is such that this should not be hard-coded in the hive
>>>> query. Rather should be generated in the custom UDF. )
>>>>
>>>> So can I do something like I v shown below?
>>>>
>>>> *set file_name= getFilePath( "0","testServer" );* //Define a parameter.
>>>> *.................*
>>>> *..............*
>>>> *INSERT OVERWRITE LOCAL DIRECTORY 'file:///${hiveconf:file_name}'*
>>>> //Assign the above parameter as the file URL
>>>>
>>>> I tried this way. But the directory name is returned as
>>>>
>>>> file:/getFilePath( "0" , "testServer" )
>>>>
>>>> Does that mean I cannot use UDF to define the local directory name?
>>>> Or am I doing anything wrong in here?
>>>>
>>>>
>>>> --
>>>> ~Regards
>>>> *Manisha Eleperuma*
>>>> Software Engineer
>>>> WSO2, Inc.: http://wso2.com
>>>> lean.enterprise.middleware
>>>>
>>>
>>>
>>> --
>>> ~Regards
>>> *Manisha Eleperuma*
>>> Software Engineer
>>> WSO2, Inc.: http://wso2.com
>>> lean.enterprise.middleware
>>>
>>>
>>> _______________________________________________
>>> Dev mailing list
>>> [email protected]
>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>
>>
>>
>> --
>> Regards,
>>
>> Tharindu
>>
>> blog: http://mackiemathew.com/
>> M: +94777759908
>>
>>
>> _______________________________________________
>> Dev mailing list
>> [email protected]
>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>
>
>
> --
> *Afkham Azeez*
> Director of Architecture; WSO2, Inc.; http://wso2.com
> Member; Apache Software Foundation; http://www.apache.org/
> email: [email protected] cell: +94 77 3320919
> blog: http://blog.afkham.org
> twitter: http://twitter.com/afkham_azeez
> linked-in: http://lk.linkedin.com/in/afkhamazeez
>
> *Lean . Enterprise . Middleware*
>

--
Regards,

Tharindu

blog: http://mackiemathew.com/
M: +94777759908
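
For reference, the behaviour Manisha reports is consistent with how Hive variables work as far as I know: `set` stores the right-hand side as literal text, and `${hiveconf:file_name}` is a plain textual substitution, so the UDF call is never evaluated. If the path is instead computed in the Java task mentioned above, a minimal sketch could look like the following. The class name `LogPathSketch`, the method `buildLogPath`, and its parameters are hypothetical and only illustrate the naming scheme from the example path in the thread; this is not the actual `getFilePath` UDF.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: builds a per-tenant, per-server, per-date log path
// in the same shape as the example path quoted in the thread.
public class LogPathSketch {

    // Date suffix in the form 2012_07_22, matching the example path.
    private static final DateTimeFormatter DATE_FORMAT =
            DateTimeFormatter.ofPattern("yyyy_MM_dd");

    // baseDir, tenantId and serverName are assumed inputs, not names from BAM2.
    public static String buildLogPath(String baseDir, String tenantId,
                                      String serverName, LocalDate date) {
        return baseDir + "/log_" + tenantId + "_" + serverName
                + "_" + date.format(DATE_FORMAT);
    }

    public static void main(String[] args) {
        String path = buildLogPath("home/user/Desktop/logDir/logs",
                "0", "testServer", LocalDate.of(2012, 7, 22));
        // Prints: home/user/Desktop/logDir/logs/log_0_testServer_2012_07_22
        System.out.println(path);
    }
}
```

With a standalone Hive CLI, a value computed this way could be supplied from outside the script through `--hiveconf file_name=...`, so that the substitution receives an already-resolved string. Whether the Hive bundled with BAM2 offers an equivalent hook is not established in this thread.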
