On Wed, Sep 16, 2009 at 10:42 PM, 김영우 <[email protected]> wrote:
> Hi Edward,
>
> It would be nice and very useful. Sometimes I want to select my own
> 'partition' or 'datafile' explicitly, something like below:
>
> SELECT * FROM weblogs PARTITION ('2009-09-17', '2009-09-18') WHERE
> col1='..' and col2= ...
>
> Or users can select data files from directory:
>
> SELECT * FROM weblogs DATAFILE ('log1.txt', 'log2.txt') WHERE col1='..' and
> col2= ...
>
> Anyway, your idea is very cool!
>
> Youngwoo
>
> 2009/9/17 Edward Capriolo <[email protected]>
>>
>> I am dumping files into a Hive partition at five-minute intervals. I am
>> using LOAD DATA to load them into the partition.
>>
>> weblogs
>> web1.00
>> web1.05
>> web1.10
>> ...
>> web2.00
>> web2.05
>> web1.10
>> ....
>>
>> Things that would be useful..
>>
>> Select files from the folder with a regex or exact name
>>
>> select * FROM logs where FILENAME LIKE(WEB1*)
>>
>> select * FROM LOGS WHERE FILENAME=web2.00
>>
>> Also it would be nice to be able to select offsets in a file, this
>> would make sense with appends
>>
>> select * from logs WHERE FILENAME=web2.00 FROMOFFSET=454644 [TOOFFSET=]
>>
>> Do these make sense to anyone?
>>
>> Edward
>
>
I added your comments to
https://issues.apache.org/jira/browse/HIVE-837
Depending on how you are set up, you can do this with a WHERE clause.
SELECT * FROM weblogs PARTITION ('2009-09-17'
For example, I partition by date and by hour:
PARTITIONED BY (log_date_part string, log_hour_part string)
Then either
select * from table where log_date_part like ('2009%')
or
select * from table where log_date_part = '2009-05-05' OR
log_date_part = '2009-05-06'
selects only the partitions you want, so you should be able to do that already.
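
Here is a rough sketch of that setup end to end, in case it helps. The table name weblogs and the columns col1 and col2 come from your example. The HDFS path /tmp/web1.00 and the partition values are made up for illustration, so adjust them to your own layout.

```sql
-- Table partitioned by date and hour. col1 and col2 stand in for the real log columns.
CREATE TABLE weblogs (col1 STRING, col2 STRING)
PARTITIONED BY (log_date_part STRING, log_hour_part STRING);

-- Load one five-minute file into the partition it belongs to.
-- '/tmp/web1.00' is a hypothetical HDFS path.
LOAD DATA INPATH '/tmp/web1.00'
INTO TABLE weblogs
PARTITION (log_date_part = '2009-09-17', log_hour_part = '00');

-- Roughly what PARTITION ('2009-09-17', '2009-09-18') was asking for,
-- expressed as a filter on the partition column.
SELECT * FROM weblogs
WHERE log_date_part = '2009-09-17' OR log_date_part = '2009-09-18';

-- Narrow further to a single hour within one day.
SELECT * FROM weblogs
WHERE log_date_part = '2009-09-17' AND log_hour_part = '00';
```

This only narrows things down to a partition. It does not pick individual files inside a partition or byte offsets inside a file, which is the part the JIRA ticket is about.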