On Fri, Sep 11, 2009 at 3:26 PM, Prasad Chakka <[email protected]> wrote:
> You should create a table partitioned by day. Then you only need to add a new
> partition each day, which happens automatically if you use 'LOAD DATA ... INTO
> TABLE ... PARTITION (ds='2009-09-01')'.
>
> Prasad
>
>
> ________________________________
> From: Mayuran Yogarajah <[email protected]>
> Reply-To: <[email protected]>
> Date: Fri, 11 Sep 2009 12:20:25 -0700
> To: <[email protected]>
> Subject: General design/schema question
>
> We have our files in HDFS laid out by day like this:
>
> 2009-09-01/files
> 2009-09-02/files
> 2009-09-03/files
>
> Loading this data into Hive would mean creating a new table per day!
>
> I'm thinking this might be a common issue though, since others most likely
> do batch processing on a daily/nightly basis.  Is there any way to have the
> data in Hive without creating a new table per day?
>
> thanks
>
>
I partitioned by DAY/HOUR; that way I can run hourly queries in a short
amount of time. We also have 5-minute logs, so each hour partition holds
12 files per web server.
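
For illustration, here is a rough HiveQL sketch of a DAY/HOUR layout along
these lines. The table name (weblogs), the single STRING column, the
partition keys (ds, hr) and the HDFS path are all placeholders I made up,
not the actual schema, so adjust them to your data:

```sql
-- Hypothetical table: one row per raw log line, partitioned by day and hour.
CREATE TABLE weblogs (line STRING)
PARTITIONED BY (ds STRING, hr STRING);

-- Loading into a partition creates it if it does not exist yet.
-- The source path is an assumed layout (day directory, then hour directory).
-- Note that LOAD DATA INPATH moves the files out of the source directory.
LOAD DATA INPATH '/logs/2009-09-01/00'
INTO TABLE weblogs PARTITION (ds='2009-09-01', hr='00');

-- Filtering on the partition columns means only that hour's files are read.
SELECT COUNT(1)
FROM weblogs
WHERE ds = '2009-09-01' AND hr = '00';
```

With 5-minute logs, each such hour partition would hold 60 / 5 = 12 files
per web server, and a query for a whole day just drops the hr condition.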
