Re: Iceberg or deltalake table as input for drill queries

2021-07-06 Thread Christian Pfarr
Hi Charles,

I've opened a GitHub issue.

https://github.com/apache/drill/issues/2269 


I hope this helps; I would like to discuss the details with you.

Regards,
z0ltrix

‐‐‐ Original Message ‐‐‐

Charles Givre wrote on Tuesday, July 6, 2021 at 15:25:

> It sounds like there is interest in developing a storage plugin for Drill to 
> query Apache Iceberg. We've actually discussed this internally as well. We 
> could start looking into this.
> 

> -- C
> 

> > On Jul 5, 2021, at 11:05 AM, luoc l...@apache.org wrote:
> > 

> > Hi z0ltrix,
> > 

> > There are two links for contributing ideas. [1] is the issue tracker 
> > for anything (recommended). [2] is the contribution guideline. Enjoy!
> > 

> > [1] Github Issues https://github.com/apache/drill/issues
> > 

> > [2] Guideline Notice https://github.com/apache/drill/issues/2233
> > 

> > > On July 5, 2021, at 12:12 AM, Christian Pfarr z0lt...@pm.me.INVALID wrote:
> > > 

> > > Hi luoc,
> > > 

> > > Of course. I would be happy to support you with this.
> > > 

> > > How do I get started?
> > > 

> > > Regards,
> > > 

> > > z0ltrix
> > > 

> > > -------- Original Message --------
> > > 

> > > On July 4, 2021, 17:37, luoc wrote:
> > > 

> > > Hi,
> > > 

> > > Makes perfect sense so far. Obviously, you understand the difference 
> > > between batch computation and ad-hoc querying. At the same time, Drill 
> > > is a high-performance, schema-free MPP query layer for self-describing 
> > > data that supports ANSI SQL.
> > > 

> > > Would you mind helping me open an issue on GitHub? It is a good way to 
> > > initiate the technical discussion.
> > > 

> > > > On July 4, 2021, at 02:54, Christian Pfarr z0lt...@pm.me.invalid wrote:
> > > > 

> > > > Hi luoc,
> > > > 

> > > > Thanks for the information.
> > > > 

> > > > I think this kind of storage format is used more and more in cloud 
> > > > architectures because IT departments want to use as few tools as 
> > > > possible to provide a big data product. With Iceberg they can build 
> > > > consistent and scalable big data structures for stream and batch 
> > > > processing on the same storage layer with a single tool, Spark.
> > > > 

> > > > The problem is how to provide the data to customers. In my opinion, 
> > > > Spark itself is too slow for interactive querying by many users or 
> > > > BI tools. That's the point where tools like Presto, Drill, or Dremio 
> > > > enter the stage.
> > > > 

> > > > I would like to see Drill as a competitor in this area, especially 
> > > > because of its brilliant, flexible, schemaless design.
> > > > 

> > > > If the Iceberg implementation is already done for the Metastore and 
> > > > you are already experienced with its internals, it sounds worth 
> > > > investing the time and energy in a new format plugin.
> > > > 

> > > > Just the opinion of a consultant who wants to recommend Drill for 
> > > > these use cases ;)
> > > > 

> > > > Regards
> > > > 

> > > > z0ltrix
> > > > 

> > > > -------- Original Message --------
> > > > 

> > > > On July 3, 2021, 16:55, luoc wrote:
> > > > 

> > > > Hello,
> > > > 

> > > > Thanks for the interest. Drill's Metastore can use a storage engine 
> > > > based on Iceberg tables. However, it seems that Drill does not 
> > > > currently support querying Iceberg data. Drill can definitely support 
> > > > Iceberg, for both reading and writing. The condition is that we need 
> > > > to develop a format plugin using the "Easy framework based on EVF". 
> > > > Please let me know if you are interested in that.
> > > > 

> > > > > On July 3, 2021, at 2:41 AM, Christian Pfarr z0lt...@pm.me.INVALID wrote:
> > > > > 

> > > > > Hello everyone,
> > > > > 

> > > > > It looks like more and more people are using Delta Lake or Iceberg 
> > > > > in Spark for transactional work with big tables.
> > > > > 

> > > > > Additionally, I saw that Drill is using Iceberg as a storage engine 
> > > > > for metadata.
> > > > > 

> > > > > So I wonder if it's possible to query Iceberg tables stored in HDFS 
> > > > > or S3 directly via Drill, so that I can process my data with Spark 
> > > > > Iceberg tables and present them with Drill to my data scientists.
> > > > > 

> > > > > Regards,
> > > > > 

> > > > > z0ltrix
> > > > > 

> > > > > 
> > > > 

> > > > 
> > > 

> > > 



Re: Iceberg or deltalake table as input for drill queries

2021-07-06 Thread Charles Givre
It sounds like there is interest in developing a storage plugin for Drill to 
query Apache Iceberg.  We've actually discussed this internally as well.  We 
could start looking into this.
-- C

> On Jul 5, 2021, at 11:05 AM, luoc  wrote:
> 
> Hi z0ltrix,
> There are two links for contributing ideas. [1] is the issue tracker for 
> anything (recommended). [2] is the contribution guideline. Enjoy!
> 
> [1] GitHub Issues https://github.com/apache/drill/issues
> [2] Guideline Notice https://github.com/apache/drill/issues/2233
> 
>> On July 5, 2021, at 12:12 AM, Christian Pfarr wrote:
>> 
>> Hi luoc,
>> 
>> 
>> Of course. I would be happy to support you with this.
>> 
>> How do I get started?
>> 
>> 
>> Regards,
>> 
>> z0ltrix
>> 
>> 
>> 
>> 
>> 
>> 
>> -------- Original Message --------
>> On July 4, 2021, 17:37, luoc wrote:
>> 
>> 
>> Hi,
>> Makes perfect sense so far. Obviously, you understand the difference between 
>> batch computation and ad-hoc querying. At the same time, Drill is a 
>> high-performance, schema-free MPP query layer for self-describing data that 
>> supports ANSI SQL.
>> Would you mind helping me open an issue on GitHub? It is a good way to 
>> initiate the technical discussion.
>> 
>>> On July 4, 2021, at 02:54, Christian Pfarr wrote:
>>> Hi luoc,
>>> 
>>> 
>>> Thanks for the information.
>>> 
>>> 
>>> I think this kind of storage format is used more and more in cloud 
>>> architectures because IT departments want to use as few tools as possible 
>>> to provide a big data product. With Iceberg they can build consistent and 
>>> scalable big data structures for stream and batch processing on the same 
>>> storage layer with a single tool, Spark.
>>> 
>>> 
>>> The problem is how to provide the data to customers. In my opinion, Spark 
>>> itself is too slow for interactive querying by many users or BI tools. 
>>> That's the point where tools like Presto, Drill, or Dremio enter the stage.
>>> 
>>> 
>>> I would like to see Drill as a competitor in this area, especially because 
>>> of its brilliant, flexible, schemaless design.
>>> 
>>> 
>>> If the Iceberg implementation is already done for the Metastore and you are 
>>> already experienced with its internals, it sounds worth investing the time 
>>> and energy in a new format plugin.
>>> 
>>> 
>>> Just the opinion of a consultant who wants to recommend Drill for these 
>>> use cases ;)
>>> 
>>> 
>>> Regards
>>> 
>>> z0ltrix
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> -------- Original Message --------
>>> On July 3, 2021, 16:55, luoc wrote:
>>> 
>>> Hello,
>>> Thanks for the interest. Drill's Metastore can use a storage engine based 
>>> on Iceberg tables. However, it seems that Drill does not currently support 
>>> querying Iceberg data. Drill can definitely support Iceberg, for both 
>>> reading and writing. The condition is that we need to develop a format 
>>> plugin using the "Easy framework based on EVF". Please let me know if you 
>>> are interested in that.
>>> 
>>>> On July 3, 2021, at 2:41 AM, Christian Pfarr wrote:
>>>> 
>>>> Hello everyone,
>>>> 
>>>> It looks like more and more people are using Delta Lake or Iceberg in 
>>>> Spark for transactional work with big tables.
>>>> 
>>>> Additionally, I saw that Drill is using Iceberg as a storage engine for 
>>>> metadata.
>>>> 
>>>> So I wonder if it's possible to query Iceberg tables stored in HDFS or S3 
>>>> directly via Drill, so that I can process my data with Spark Iceberg 
>>>> tables and present them with Drill to my data scientists.
>>>> 
>>>> Regards,
>>>> 
>>>> z0ltrix
>>>> 
>>> 
>>> 
>> 
>> 
> 



Re: Drill on AIX

2021-07-06 Thread Prabhakar Bhosaale
Thanks Charles for your quick response.

Regards
Prabhakar

On Tue, Jul 6, 2021 at 5:24 PM Charles Givre  wrote:

> Hi Prabhakar,
> To the best of my knowledge, Drill is not currently certified on IBM AIX.
> I'm sure the community would welcome the opportunity to get it certified.
> -- C
>
>
> > On Jul 6, 2021, at 6:20 AM, Prabhakar Bhosaale wrote:
> >
> > Hi Team,
> > Is Drill version 1.16 and onwards tested and certified on IBM AIX? thx
> >
> > Regards
> > Prabhakar
>
>


Re: Drill on AIX

2021-07-06 Thread Charles Givre
Hi Prabhakar, 
To the best of my knowledge, Drill is not currently certified on IBM AIX.  I'm 
sure the community would welcome the opportunity to get it certified.
-- C


> On Jul 6, 2021, at 6:20 AM, Prabhakar Bhosaale  wrote:
> 
> Hi Team,
> Is Drill version 1.16 and onwards tested and certified on IBM AIX? thx
> 
> Regards
> Prabhakar



Drill on AIX

2021-07-06 Thread Prabhakar Bhosaale
Hi Team,
Is Drill version 1.16 and onwards tested and certified on IBM AIX? thx

Regards
Prabhakar