Akshay,

That's great news!
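
For anyone who wants to carry the original dfs view over to S3, here is a
minimal sketch in Python that only assembles the statement as text. It
assumes the dirN columns behave on the S3 plugin the way they do on dfs,
which I have not verified. The helper name build_partition_view, the view
name temp, and the path s3.`product` are placeholders, not anything taken
from a real setup.

```python
def build_partition_view(view_name, source, partition_names):
    """Return a Drill CREATE VIEW statement that aliases dir0, dir1, ...

    Hypothetical helper: it only builds the SQL text and never contacts Drill.
    """
    # One alias per directory level, in order: dir0 -> first name, and so on.
    aliases = [
        f"`dir{i}` AS `{name}`" for i, name in enumerate(partition_names)
    ]
    # Keep the trailing * so the file columns come through as in the original.
    select_list = ", ".join(aliases + ["*"])
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS "
        f"SELECT {select_list} FROM {source}"
    )


# Placeholder names; the view still has to live in a writable workspace.
statement = build_partition_view(
    "temp", "s3.`product`", ["prod", "year", "month", "day"]
)
print(statement)
```

The aliases are backtick-quoted because names such as year and month can
clash with SQL keywords.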

On Tue, Apr 27, 2021 at 1:10 PM Akshay Bhasin (BLOOMBERG/ 731 LEX) <
abhasi...@bloomberg.net> wrote:

> Hi Ted,
>
> Thanks for reaching out. Yes - the below worked successfully.
>
> I was able to create objects in S3 such as 'XXX/YYY/filename' and
> 'XXX/ZZZ/filename', and I was able to query them all with
> SELECT * FROM XXX.
>
> Thanks !
>
> Best,
> Akshay
>
> From: ted.dunn...@gmail.com At: 04/21/21 17:21:42
> To: Akshay Bhasin (BLOOMBERG/ 731 LEX ) <abhasi...@bloomberg.net>,
> dev@drill.apache.org
> Subject: Re: Feature/Question
>
>
> Akshay,
>
> Yes. It is possible to do what you want from a few different angles.
>
> As you have noted, S3 doesn't have directories. Not really. On the other
> hand, people simulate this using naming schemes and S3 has some support for
> this.
>
> One of the simplest ways to deal with this is to create a view that
> explicitly mentions every S3 object that you have in your table. The
> contents of this view can get a bit cumbersome, but that shouldn't be a
> problem since users never need to know. You will need to set up a scheduled
> action to update this view occasionally, but that is pretty simple.
>
> The other way is to use a naming scheme with a delimiter such as /. This
> is described at
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html
> If you do that and have files named (for instance) foo/a.json, foo/b.json,
> foo/c.json and you query
>
>     select * from s3.`foo`
>
> you should see the contents of a.json, b.json and c.json. See here for
> commentary
> <https://stackoverflow.com/questions/44785065/apache-drill-how-to-query-all-files-in-an-s3-bucket>
>
> I haven't tried this, however, so I am simply going on the reports of
> others. If this works for you, please report your success back here.
>
>
>
>
>
> On Wed, Apr 21, 2021 at 11:34 AM Akshay Bhasin (BLOOMBERG/ 731 LEX) <
> abhasi...@bloomberg.net> wrote:
>
>> Hi Drill Community,
>>
>> I'm Akshay and I'm using Drill for a project I'm working on.
>>
>> There is a particular use case I want to implement, and I want to know if
>> it's possible.
>>
>> 1) Currently, we have a partitioned directory layout on the file system and
>> we create a view on top of it. For example, we have the directory structure
>> below -
>>
>> /home/product/product_name/year/month/day/*parquet
>> /home/product/product_name_2/year/month/day/*parquet
>> /home/product/product_name_3/year/month/day/*parquet
>>
>> Now, we create a view over it -
>> Create view temp AS SELECT `dir0` AS prod, `dir1` as year, `dir2` as
>> month, `dir3` as day, * from dfs.`/home/product`;
>>
>> Then, we can query all the data dynamically -
>> SELECT * from temp LIMIT 5;
>>
>> 2) Now I want to replicate this behavior on S3, and I want to ask if it's
>> possible. I was able to create a logical directory, but S3 inherently does
>> not support directories, only objects.
>>
>> Therefore, I was curious to know whether this is supported or whether there
>> is another way to do it. I was unable to find any documentation on your
>> website related to partitioning data on S3.
>>
>> Thanks for your help.
>> Best,
>> Akshay
>
>
>
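
As a footnote to the first suggestion in the thread (a view that explicitly
mentions every S3 object, refreshed by a scheduled action), here is a rough
Python sketch of the part that regenerates the view text. Joining the objects
with UNION ALL is my own reading of "mentions every object", and whether
SELECT * works across your files depends on their schemas, so treat it as an
assumption. The helper name build_object_view and the view name temp are
hypothetical, and listing the keys is left to whatever S3 client the
scheduled job already uses.

```python
def build_object_view(view_name, plugin, keys):
    """Return a Drill statement for a view that unions every listed object.

    Hypothetical helper: `keys` is the object listing produced elsewhere;
    nothing here contacts S3 or Drill.
    """
    if not keys:
        raise ValueError("at least one object key is required")
    # Sort so the generated text is stable from one scheduled run to the next.
    selects = [f"SELECT * FROM {plugin}.`{key}`" for key in sorted(keys)]
    body = "\nUNION ALL\n".join(selects)
    return f"CREATE OR REPLACE VIEW {view_name} AS\n{body}"


# Example keys borrowed from the thread; the listing order does not matter.
view_sql = build_object_view("temp", "s3", ["foo/b.json", "foo/a.json"])
print(view_sql)
```

The scheduled job would then submit the returned statement to Drill, so users
keep querying the same view name while the object list changes underneath.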
