The requirement is simple: we need to generate log files on a per-tenant,
per-date, per-service basis. As a big data and analytics expert, please
advise us on the best solution for this.

Azeez
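
For illustration, the requirement above (one log file per tenant, per date, per service) could be sketched in plain Java roughly as follows. This is only a minimal sketch under stated assumptions: the `TenantLogSplitter` and `LogEvent` names and the `<baseDir>/<tenantId>/<date>/<service>.log` layout are hypothetical, nothing here is taken from BAM2, and the events are assumed to have already been read from the data store.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: splits log events into one file per tenant, date, and service. */
class TenantLogSplitter {

    /** Hypothetical event shape; real events would come from the data store. */
    static final class LogEvent {
        final String tenantId;
        final String service;
        final LocalDate date;
        final String message;

        LogEvent(String tenantId, String service, LocalDate date, String message) {
            this.tenantId = tenantId;
            this.service = service;
            this.date = date;
            this.message = message;
        }
    }

    /** Assumed layout: baseDir/tenantId/yyyy-MM-dd/service.log */
    static Path targetFile(Path baseDir, LogEvent e) {
        return baseDir.resolve(e.tenantId)
                .resolve(e.date.toString())
                .resolve(e.service + ".log");
    }

    /** Groups messages by the file they belong to, keeping arrival order. */
    static Map<Path, List<String>> groupByFile(Path baseDir, List<LogEvent> events) {
        Map<Path, List<String>> byFile = new LinkedHashMap<>();
        for (LogEvent e : events) {
            byFile.computeIfAbsent(targetFile(baseDir, e), k -> new ArrayList<>())
                    .add(e.message);
        }
        return byFile;
    }

    /** Appends each group of messages to its file, creating directories as needed. */
    static void writeAll(Path baseDir, List<LogEvent> events) throws IOException {
        for (Map.Entry<Path, List<String>> entry : groupByFile(baseDir, events).entrySet()) {
            Files.createDirectories(entry.getKey().getParent());
            Files.write(entry.getKey(), entry.getValue(), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tenant-logs");
        writeAll(dir, Arrays.asList(
                new LogEvent("0", "testServer", LocalDate.of(2012, 7, 22), "server started"),
                new LogEvent("1", "testServer", LocalDate.of(2012, 7, 22), "tenant loaded")));
        System.out.println("Wrote log files under " + dir);
    }
}
```

A scheduler such as `java.util.concurrent.ScheduledExecutorService` could call `writeAll` periodically. How well this copes with large volumes depends on how the events are fetched and batched, which this sketch does not address.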

On Mon, Jul 23, 2012 at 6:05 PM, Tharindu Mathew <[email protected]> wrote:

> So through this custom java task, what is the scale of log processing you
> will support? 100MB, 1 GB, 100 GB, 1 TB?
>
> On Mon, Jul 23, 2012 at 5:14 PM, Manisha Gayathri <[email protected]> wrote:
>
>> Contacted Hive User Group as well on this matter.
>> They also mentioned that this approach is not possible.
>> Also, as per the chat I had with Buddhika, this kind of dynamic variable
>> creation is currently not possible in the Hive version that comes with BAM2.
>>
>> Therefore IMO, without going ahead with this cumbersome process, the best
>> way will be to run a scheduled java task to pick data from relevant
>> Cassandra Column families and dynamically generate the relevant log files
>> (according to the tenantID and current date) which will be stored in Apache
>> Directory.
>>
> You are going to store the results in an LDAP directory?
>
>>
>> As per the offline chat I had with Azeez, I will start to work on a custom
>> Java task that can handle the above scenario.
>>
>> On Mon, Jul 23, 2012 at 2:27 PM, Manisha Gayathri <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> For a log file storing scenario using BAM2, I have a requirement to
>>> generate separate log files for each date. For that, I have created a Hive
>>> analytics query along with a Hive UDF.
>>>
>>> I have a getFilePath function, which should return a path like this:
>>>
>>> home/user/Desktop/logDir/logs/log_0_testServer_2012_07_22
>>>
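
The UDF itself is not shown in this thread. As a rough illustration only, the path-building logic could look like the sketch below. The `LogPathSketch` class name, the `BASE_DIR` constant, and the three-argument overload that takes an explicit date are assumptions introduced here. The base directory and the `log_<tenant>_<server>_<yyyy_MM_dd>` pattern are read off the example path above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch of the path logic behind a getFilePath-style function. */
class LogPathSketch {

    // Base directory copied from the example path in the thread.
    static final String BASE_DIR = "home/user/Desktop/logDir/logs";

    // Date part of the file name, e.g. 2012_07_22.
    private static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("yyyy_MM_dd");

    /** Builds the path for an explicit date (assumed overload, added for testability). */
    static String getFilePath(String tenantId, String serverName, LocalDate date) {
        return BASE_DIR + "/log_" + tenantId + "_" + serverName + "_" + date.format(DATE);
    }

    /** Two-argument form as used in the thread; uses the current date. */
    static String getFilePath(String tenantId, String serverName) {
        return getFilePath(tenantId, serverName, LocalDate.now());
    }

    public static void main(String[] args) {
        // prints home/user/Desktop/logDir/logs/log_0_testServer_2012_07_22
        System.out.println(getFilePath("0", "testServer", LocalDate.of(2012, 7, 22)));
    }
}
```

To expose this to Hive, a simple UDF would typically wrap the same logic in a class extending `org.apache.hadoop.hive.ql.exec.UDF` with an `evaluate` method. That wrapper is omitted here so the sketch stays runnable without Hive on the classpath.
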
>>> The defined function works perfectly if I put getFilePath( "0","testServer" )
>>> into the select statement.
>>>
>>> But I want to use that particular path as the local directory name.
>>> (The requirement is that this should not be hard-coded in the Hive
>>> query; rather, it should be generated in the custom UDF.)
>>>
>>> So can I do something like what I have shown below?
>>>
>>> set file_name= getFilePath( "0","testServer" );    // Define a parameter.
>>> .................
>>> ..............
>>> INSERT OVERWRITE LOCAL DIRECTORY 'file:///${hiveconf:file_name}'
>>>              // Assign the above parameter as the file URL
>>>
>>> I tried this way. But the directory name is returned as
>>>
>>> file:/getFilePath( "0" , "testServer" )
>>>
>>> Does that mean I cannot use a UDF to define the local directory name?
>>> Or am I doing something wrong here?
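
The result reported above is consistent with Hive treating `set` as a plain string assignment and `${hiveconf:...}` as textual substitution performed before the query is parsed, so the UDF call is never evaluated. The sketch below is a simplified model of that behaviour written for this note, not Hive's actual code. The `HiveconfSubstitutionSketch` class and its `substitute` method are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified, hypothetical model of ${hiveconf:name} textual substitution. */
class HiveconfSubstitutionSketch {

    private static final Pattern VAR = Pattern.compile("\\$\\{hiveconf:([^}]+)\\}");

    /** Replaces each ${hiveconf:name} with the stored string, without evaluating it. */
    static String substitute(String query, Map<String, String> conf) {
        Matcher m = VAR.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = conf.get(m.group(1));
            // Unknown variables are left untouched in this sketch.
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // "set file_name= getFilePath(...)" stores the text of the call, not its result.
        conf.put("file_name", "getFilePath( \"0\",\"testServer\" )");
        System.out.println(substitute(
                "INSERT OVERWRITE LOCAL DIRECTORY 'file:///${hiveconf:file_name}'", conf));
    }
}
```

Under this model, the variable would have to hold an already computed path before the script runs, for example one computed by the calling program. Whether BAM2 offers a hook for that is not established in this thread.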
>>>
>>>
>>> --
>>> ~Regards
>>> *Manisha Eleperuma*
>>> Software Engineer
>>> WSO2, Inc.: http://wso2.com
>>> lean.enterprise.middleware
>>>
>>>
>>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Dev mailing list
>> [email protected]
>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>
>>
>
>
> --
> Regards,
>
> Tharindu
>
> blog: http://mackiemathew.com/
> M: +94777759908
>
>
>
>


-- 
Afkham Azeez
Director of Architecture; WSO2, Inc.; http://wso2.com
Member; Apache Software Foundation; http://www.apache.org/
email: [email protected] cell: +94 77 3320919
blog: http://blog.afkham.org
twitter: http://twitter.com/afkham_azeez
linked-in: http://lk.linkedin.com/in/afkhamazeez

Lean . Enterprise . Middleware