+1

> On Oct 30, 2019, at 2:36 PM, Costello, Roger L. <[email protected]> wrote:
> 
> Excellent! Okay, here’s the use case:
>  
> A Daffodil extension could be created for Apache Drill so that you could 
> parse any kind of data with Daffodil using a DFDL schema, and then you could 
> use ANSI SQL to query the data, join it with other data, do analysis, etc., 
> just as if it came from a database. So, instead of parsing data to XML and 
> then using XPath to pull out data, you could instead parse data to Apache 
> Drill's data representation and then use ANSI SQL to pull out data, and even 
> combine it with other non-Daffodil data types. The advantage for this would 
> be that it would make it very easy to enable Drill to query new data types 
> (i.e., simply by using a DFDL schema) and it would enable users to easily query 
> this data without having to load it into another system.
>  
> How’s that, Charles?
>  
> /Roger
> From: Charles Givre <[email protected] <mailto:[email protected]>> 
> Sent: Wednesday, October 30, 2019 2:28 PM
> To: Costello, Roger L. <[email protected] <mailto:[email protected]>>
> Cc: [email protected] <mailto:[email protected]>
> Subject: [EXT] Re: Use cases for DFDL
>  
> Close... One minor nit is that Drill doesn't use a "query-like" syntax. It is 
> regular ANSI SQL.  IMHO, I think this would be a really great collaboration 
> of the two communities.
> --C
>  
> 
> 
> On Oct 30, 2019, at 1:10 PM, Costello, Roger L. <[email protected] 
> <mailto:[email protected]>> wrote:
>  
> Thanks again Charles. Is the following use case description correct?
>  
> A Daffodil extension could be created for Apache Drill so that you could 
> parse any kind of data with Daffodil using a DFDL schema, and then you could 
> use Apache Drill's query-like syntax and rich capabilities to query parts of 
> that data, join it with other data, do analysis, etc., just as if it came 
> from a database. So, instead of parsing data to XML and then using XPath to 
> pull out data, you could instead parse data to Apache Drill's data 
> representation and then use Drill's rich data-query capabilities to pull out 
> data, and even combine it with other non-Daffodil data types. The advantage 
> for this would be that it would make it very easy to enable Drill to query 
> new data types (i.e., simply by using a DFDL schema) and it would enable users 
> to easily query this data without having to load it into another system.
>  
> Is that correct?
>  
> /Roger
> From: Charles Givre <[email protected] <mailto:[email protected]>> 
> Sent: Wednesday, October 30, 2019 12:19 PM
> To: Costello, Roger L. <[email protected] <mailto:[email protected]>>
> Cc: [email protected] <mailto:[email protected]>
> Subject: [EXT] Re: Use cases for DFDL
>  
> Not exactly...
> I was thinking of using DFDL to enable Drill to create a schema for data that 
> Drill cannot read.  If DFDL can be used to describe the schema, a plugin 
> could be written for Drill that mirrors this schema and ultimately reads the 
> data files.  Drill wouldn't be populating any database, but rather directly 
> querying the data.
>  
> The advantage for this would be that it would make it very easy to enable 
> Drill to query new data types (i.e., simply by using a DFDL schema) and it would 
> enable users to easily query this data w/o having to load it into another 
> system.  Does that make sense?
> -- C
>  
>  
> On Oct 30, 2019, at 12:13 PM, Costello, Roger L. <[email protected] 
> <mailto:[email protected]>> wrote:
>  
> Thanks Charles. Let me see if I understand the use case correctly.
>  
> Use DFDL to parse data to populate a database and then use Apache Drill to 
> query the database.
>  
> Is that correct?
>  
> /Roger 
>  
> From: Charles Givre <[email protected] <mailto:[email protected]>> 
> Sent: Wednesday, October 30, 2019 12:01 PM
> To: [email protected] <mailto:[email protected]>
> Subject: [EXT] Re: Use cases for DFDL
>  
> To add to this discussion, I'm the PMC chair for Apache Drill.  I think a 
> compelling use case for DFDL would be enabling Drill to query data based on 
> a DFDL schema.  This same concept could be applied to other SQL query 
> engines such as Presto and/or Impala. 
>  
> IMHO, this would facilitate the analysis of data sets supported by DFDL. 
> -- C
> 
> 
> 
> 
> On Oct 30, 2019, at 11:53 AM, Costello, Roger L. <[email protected] 
> <mailto:[email protected]>> wrote:
>  
> Thanks Mike! I updated the slide:
>  
> <image002.png>
>  
> From: Beckerle, Mike <[email protected] <mailto:[email protected]>> 
> Sent: Wednesday, October 30, 2019 11:45 AM
> To: [email protected] <mailto:[email protected]>
> Subject: [EXT] Re: Use cases for DFDL
>  
> I would not pick on RDF data stores as the target.
>  
> Parsing data to populate a database (any variety) is the actual case. The 
> fact that we did do one project involving RDF is why I cited that example in 
> particular, but pulling data into any data store/database begins with the 
> ability to parse the data, and then process it into suitable form.
>  
> This is an incomplete list so perhaps this slide title should be "Example Use 
> Cases for DFDL" ?
>  
> ...mikeb
> From: Costello, Roger L. <[email protected] <mailto:[email protected]>>
> Sent: Monday, October 28, 2019 10:41 AM
> To: [email protected] <mailto:[email protected]> 
> <[email protected] <mailto:[email protected]>>
> Subject: Use cases for DFDL
>  
> Hi Folks,
>  
> I created a slide of use cases. See below. Do you agree with the slide? 
> Anything you would add, delete, or change?  /Roger
>  
> <image003.png>
