Hi
If you have enough log data to fill at least one HDFS block size in an hour,
you can go ahead as follows:
- Run a scheduled job every hour that compresses the log files for that hour
and stores them onto HDFS (you can use LZO or even Snappy to compress); a
sketch of such a job follows this list.
- If your Hive queries analyze this data frequently, store it as PARTITIONED
BY (Date, Hour). While loading into HDFS, also follow a matching directory -
sub-directory structure. Once the data is in HDFS, issue an ALTER TABLE ...
ADD PARTITION statement on the corresponding Hive table (the job sketch below
includes this step).
- In the Hive DDL, use the appropriate input format (Hive already has an
Apache log input format); see the table definition after the job sketch.
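
A minimal sketch of such an hourly job, assuming gzip for simplicity; the
local log path, the HDFS layout, and the table name (weblogs) are
placeholders you would adapt to your own setup:

    #!/bin/bash
    # Hourly job: compress the previous hour's log file, load it into HDFS
    # under a dt=/hr= directory structure, and register the partition in
    # Hive. All paths and names below are illustrative.

    DT=$(date -d '1 hour ago' +%Y-%m-%d)   # partition date
    HR=$(date -d '1 hour ago' +%H)         # partition hour
    LOCAL_LOG="/var/log/myapp/access.${DT}-${HR}.log"
    HDFS_DIR="/user/hive/warehouse/weblogs/dt=${DT}/hr=${HR}"

    # Compress locally before upload (LZO or Snappy would work here too).
    gzip -c "${LOCAL_LOG}" > "${LOCAL_LOG}.gz"

    # Create the partition directory on HDFS and upload the compressed file.
    # (Older Hadoop releases create parent dirs by default; drop the -p.)
    hadoop fs -mkdir -p "${HDFS_DIR}"
    hadoop fs -put "${LOCAL_LOG}.gz" "${HDFS_DIR}/"

    # Make the new hour's data visible to Hive.
    hive -e "ALTER TABLE weblogs ADD PARTITION (dt='${DT}', hr='${HR}') LOCATION '${HDFS_DIR}';"

One caveat: plain gzip files are not splittable, which is one more reason to
keep each file near one block size; indexed LZO (or Snappy inside a
SequenceFile) avoids that limitation.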
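
And a possible table definition to go with it. Hive decompresses gzipped
text files transparently at query time; the regex below is the stock Apache
combined-log pattern from the Hive wiki, and the table and column names are
again illustrative:

    #!/bin/bash
    # One-time setup: a partitioned Hive table over the compressed logs.
    # Uses the contrib RegexSerDe (you may need ADD JAR hive-contrib first;
    # newer Hive ships org.apache.hadoop.hive.serde2.RegexSerDe built in).
    hive -e '
    CREATE TABLE weblogs (
      host STRING, identity STRING, http_user STRING, req_time STRING,
      request STRING, status STRING, size STRING,
      referer STRING, agent STRING)
    PARTITIONED BY (dt STRING, hr STRING)
    ROW FORMAT SERDE "org.apache.hadoop.hive.contrib.serde2.RegexSerDe"
    WITH SERDEPROPERTIES (
      "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\"))?",
      "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s"
    )
    STORED AS TEXTFILE;'

With that in place, each run of the hourly job makes the new hour queryable
immediately, e.g. SELECT count(*) FROM weblogs WHERE dt='2012-02-06' AND
hr='16';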


Regards
Bejoy K S

From handheld, please excuse typos.

-----Original Message-----
From: Xiaobin She <[email protected]>
Date: Mon, 6 Feb 2012 16:41:50 
To: <[email protected]>; 佘晓彬<[email protected]>
Reply-To: [email protected]
Subject: Re: Can I write to a compressed file which is located in HDFS?

Sorry, this sentence is wrong:

I can't compress these logs every hour and then put them into HDFS.

It should be:

I can compress these logs every hour and then put them into HDFS.




2012/2/6 Xiaobin She <[email protected]>

>
> hi all,
>
> I'm testing Hadoop and Hive, and I want to use them for log analysis.
>
> Here I have a question: can I write/append logs to a compressed file
> which is located in HDFS?
>
> Our system generates lots of log files every day. I can't compress these
> logs every hour and then put them into HDFS.
>
> But what if I want to write logs into files that are already in HDFS
> and compressed?
>
> If these files were not compressed, this job would seem easy, but how can I
> write or append logs to a compressed file?
>
> Can I do that?
>
> Can anyone give me some advice or some examples?
>
> Thank you very much!
>
> xiaobin
>
