subject:"Re\: nested structs. querying and building metadata in calcite"

Re: nested structs. querying and building metadata in calcite

2018-08-14 Thread Vladimir Sitnikov

Stamatis>possibly many objects from various different classes are under a
single table

That is fine. For instance, if class ByteArrayInputStream extends
InputStream, then
select * from instanceof.java.io.InputStream; would produce both instances,
and the types of columns will be limited to the ones of InputStream

ClassLoaders break that (there might be multiple variations of InputStream
class with different fields), however I just ignore that for now.

Stamatis>Furthermore, the way you use the DOT notation in queries it mostly
means a
join rather than field access

That depends. It might make sense to apply query optimizer so it could
decide which one is better.

Stamatis>From a quick look at your project, it seems that this is what you
are doing
so I guess you are not relying on struct types after all.

It is "struct and egg" problem: Calcite lacked structs, so I used whatever
was implemented.


Re proxy types, it looks like "select *" is not defined there.
Consider


Julian> Since Calcite is grounded in SQL, I would encourage people to give
Julian> examples in terms of SQL (DDL, queries, results), not just in terms
of

type ListElement is record (
  value int,
  next ListElement
);

then one issues "select value, next from table_of_ListElements".
Of course "next" should not load all the chain. At the end of the day, it
could be even self-referencing chain (or a chain with 100500 elements)
So "next" field should behave more like a proxy to the actual data.

1) Validator should expect "next" to have the fields of ListElement, in
other words
"select value, value.next.next.value" should be valid while "select value,
next.abc" should be invalid
2) The contents of "next" should be represented as a "locator" or something
like that. That is at ResultSet processing site it could be something like

Map listElement = (Map) rs.getObject("next"); // <-- issues roundtrip
Map nextElement = (Map) listElement.get("next"); // <-- issues roundtrip
nextElement.get("value"); // <-- issues roundtrip

This approach looks to be statically typed, so it looks feasible to me.

Stamatis>Thus for the moment, I am not sure if the recursive struct
definition is a
must-have feature for Calcite

Of course it is not a must-have.
I just want to reconsider implementation of mat-calcite-plugin to use
structs or something like that so SQL queries can be written in a type-safe
manner.

Vladimir

Re: nested structs. querying and building metadata in calcite

2018-08-14 Thread Julian Hyde

Rather than relational vs. non-relational, I think a more interesting split is 
strongly-typed vs. weakly-typed (think java vs. javascript):
* SQL is strongly typed. You know that a result set is a multi-set, and that 
each element is a row, and the names/types of that row’s fields.
* Document stores are often weakly typed. You know that a result set will be a 
collection of JSON elements, but you don’t know what fields they will have.

Can we bring them together without doing too much harm to each? Yes, I believe 
we can, in just the same way that you can access JSON data in strongly-typed 
Java. In Java you write code like this:

void foo(Object json) {
  Map map = (Map) json;
  int orderId = (Integer) map.get(“orderId);
  List> lineItems = map.get(“lineItems”);
  for (Map lineItem: lineItems) {
String productName = lineItem.get(“productName”);
  }
}

In SQL, we have a MAP type for JSON objects, we have a MULTISET type for JSON 
lists, and an ITEM operator to get elements from MAPs and MULTISETs, and we 
have CAST. All the same tools as Java. (I guess we also need INSTANCEOF.) 

Julian


> On Aug 14, 2018, at 5:56 AM, Stamatis Zampetakis  wrote:
> 
> Andrei Sereda> I still have one question / clarification regarding
> semantics of select *
> for document databases (with declared schema).
> 
> SQL was conceived for relational data thus the semantics may be ambiguous
> for other kinds of data.
> Probably by now, there are works defining the semantics of SQL queries over
> document stores but I don't have in mind something in particular (or
> something that has been standardized).
> If somebody else has something in mind, I would be also interested in a
> reference.
> 
> Missing vs. null values is not a problem with SQL and relational data but
> rather with other kinds of data such as XML and JSON.
> If you map your documents to relational data (and SQL types) there
> shouldn't be any ambiguity at least at query time.
> 
> 
> 2018-08-14 15:38 GMT+03:00 Stamatis Zampetakis :
> 
>> @Vladimir
>> 
>> I don't have a concrete opinion on the proxy approach but rather some
>> general comments regarding the use-case you provided.
>> 
>> Basically, if I understood well you want to provide a query language for
>> analyzing heap dumps that is more expressive than OQL > java.net/~sundar/8022483/webrev.01/raw_files/new/src/
>> share/classes/com/sun/tools/hat/resources/oqlhelp.html>.
>> 
>> It seems that the challenge is how to represent Java classes as RDBMS
>> tables and in particular how to handle cases where there are recursive
>> definitions.
>> 
>> Using struct types seems nice in particular for querying but this means
>> that possibly many objects from various different classes are under a
>> single table.
>> In addition, the same object can be stored multiple times under different
>> tables.
>> 
>> class A {
>>  C fA;
>> }
>> 
>> class B {
>> C fB;
>> }
>> 
>> class C {
>> int fC;
>> }
>> 
>> When mapping class A to a relational table the question is what is the
>> data type of column fA and for that, I don't think a struct type C is a
>> good idea.
>> Structs are naturally used for composition while from your example we
>> mostly want an association.
>> 
>> Furthermore, the way you use the DOT notation in queries it mostly means a
>> join rather than field access.
>> Thus I would say that these kind of queries are closer to JPQL and OQL
>> than to pure SQL.
>> 
>> Since you want to use Calcite to analyze heap dumps it may be easier if
>> you stick to a more traditional RDBMS approach.
>> From a quick look at your project, it seems that this is what you are
>> doing so I guess you are not relying on struct types after all.
>> 
>> Thus for the moment, I am not sure if the recursive struct definition is a
>> must-have feature for Calcite.
>> 
>> 
>> 2018-08-14 0:09 GMT+03:00 Michael Mior :
>> 
>>> Returning a list of mytype seems preferable. As far as missing vs. null
>>> attributes, I think the distinction is commonly not that important and if
>>> it is, there should be some method of accessing the raw document as a
>>> fallback.
>>> 
>>> --
>>> Michael Mior
>>> mm...@apache.org
>>> 
>>> 
>>> 
>>> Le lun. 13 août 2018 à 16:40, Andrei Sereda  a écrit :
>>> 
 Thanks all for your answers.
 
 I still have one question / clarification regarding semantics of select
>>> *
 for document databases (with declared schema).
 
 Say I have the following type definition:
 
 CREATE TYPE mytype AS (
a varchar(2) not null,
b varchar(2) NULL // optional (null?)
 );
 
 If my document has only a present ({ a:value }) what should select *
>>> return
 ? Map with single value ({a:value}) or a pair (value, null)
 
 In other words should select * return raw document (as generic map) or
>>> list
 of mytype ?
 If later, how to differentiate between missing attribute (field) and
 attribute having null value ?
 
 On Sat, Aug 11,

Re: nested structs. querying and building metadata in calcite

2018-08-14 Thread Stamatis Zampetakis

Andrei Sereda> I still have one question / clarification regarding
semantics of select *
for document databases (with declared schema).

SQL was conceived for relational data thus the semantics may be ambiguous
for other kinds of data.
Probably by now, there are works defining the semantics of SQL queries over
document stores but I don't have in mind something in particular (or
something that has been standardized).
If somebody else has something in mind, I would be also interested in a
reference.

Missing vs. null values is not a problem with SQL and relational data but
rather with other kinds of data such as XML and JSON.
If you map your documents to relational data (and SQL types) there
shouldn't be any ambiguity at least at query time.


2018-08-14 15:38 GMT+03:00 Stamatis Zampetakis :

> @Vladimir
>
> I don't have a concrete opinion on the proxy approach but rather some
> general comments regarding the use-case you provided.
>
> Basically, if I understood well you want to provide a query language for
> analyzing heap dumps that is more expressive than OQL  java.net/~sundar/8022483/webrev.01/raw_files/new/src/
> share/classes/com/sun/tools/hat/resources/oqlhelp.html>.
>
> It seems that the challenge is how to represent Java classes as RDBMS
> tables and in particular how to handle cases where there are recursive
> definitions.
>
> Using struct types seems nice in particular for querying but this means
> that possibly many objects from various different classes are under a
> single table.
> In addition, the same object can be stored multiple times under different
> tables.
>
> class A {
>   C fA;
> }
>
> class B {
>  C fB;
> }
>
> class C {
>  int fC;
> }
>
> When mapping class A to a relational table the question is what is the
> data type of column fA and for that, I don't think a struct type C is a
> good idea.
> Structs are naturally used for composition while from your example we
> mostly want an association.
>
> Furthermore, the way you use the DOT notation in queries it mostly means a
> join rather than field access.
> Thus I would say that these kind of queries are closer to JPQL and OQL
> than to pure SQL.
>
> Since you want to use Calcite to analyze heap dumps it may be easier if
> you stick to a more traditional RDBMS approach.
> From a quick look at your project, it seems that this is what you are
> doing so I guess you are not relying on struct types after all.
>
> Thus for the moment, I am not sure if the recursive struct definition is a
> must-have feature for Calcite.
>
>
> 2018-08-14 0:09 GMT+03:00 Michael Mior :
>
>> Returning a list of mytype seems preferable. As far as missing vs. null
>> attributes, I think the distinction is commonly not that important and if
>> it is, there should be some method of accessing the raw document as a
>> fallback.
>>
>> --
>> Michael Mior
>> mm...@apache.org
>>
>>
>>
>> Le lun. 13 août 2018 à 16:40, Andrei Sereda  a écrit :
>>
>> > Thanks all for your answers.
>> >
>> > I still have one question / clarification regarding semantics of select
>> *
>> > for document databases (with declared schema).
>> >
>> > Say I have the following type definition:
>> >
>> > CREATE TYPE mytype AS (
>> > a varchar(2) not null,
>> > b varchar(2) NULL // optional (null?)
>> > );
>> >
>> > If my document has only a present ({ a:value }) what should select *
>> return
>> > ? Map with single value ({a:value}) or a pair (value, null)
>> >
>> > In other words should select * return raw document (as generic map) or
>> list
>> > of mytype ?
>> > If later, how to differentiate between missing attribute (field) and
>> > attribute having null value ?
>> >
>> > On Sat, Aug 11, 2018 at 2:51 PM Julian Hyde  wrote:
>> >
>> > > As I noted in https://issues.apache.org/jira/browse/CALCITE-2464 <
>> > > https://issues.apache.org/jira/browse/CALCITE-2464>, SQL struct
>> types do
>> > > not behave exactly like Java classes (more like Java value types). If
>> the
>> > > semantics are not as desired, maybe we’ll have to design a new type
>> > > constructor.
>> > >
>> > > Since Calcite is grounded in SQL, I would encourage people to give
>> > > examples in terms of SQL (DDL, queries, results), not just in terms of
>> > the
>> > > Java APIs.
>> > >
>> > > Lastly, I’ll draw your attention to Shuyi’s great work on “CREATE
>> TYPE”
>> > > (see https://issues.apache.org/jira/browse/CALCITE-2045 <
>> > > https://issues.apache.org/jira/browse/CALCITE-2045>). He extended
>> DDL in
>> > > the “server” module, so you can try out his examples.
>> > >
>> > > Julian
>> > >
>> > >
>> > > > On Aug 11, 2018, at 9:03 AM, Vladimir Sitnikov <
>> > > sitnikov.vladi...@gmail.com> wrote:
>> > > >
>> > > > Just to clarify the use case: I'm building SQL plugin to analyze
>> Java
>> > > heap
>> > > > dumps.
>> > > > https://github.com/vlsi/mat-calcite-plugin
>> > > >
>> > > > select * from "java.lang.String" s  produces a row for each String
>> in
>> > the
>> > > > heap dump.
>> > > >
>> >

Re: nested structs. querying and building metadata in calcite

2018-08-14 Thread Stamatis Zampetakis

@Vladimir

I don't have a concrete opinion on the proxy approach but rather some
general comments regarding the use-case you provided.

Basically, if I understood well you want to provide a query language for
analyzing heap dumps that is more expressive than OQL <
http://cr.openjdk.java.net/~sundar/8022483/webrev.01/raw_files/new/src/share/classes/com/sun/tools/hat/resources/oqlhelp.html
>.

It seems that the challenge is how to represent Java classes as RDBMS
tables and in particular how to handle cases where there are recursive
definitions.

Using struct types seems nice in particular for querying but this means
that possibly many objects from various different classes are under a
single table.
In addition, the same object can be stored multiple times under different
tables.

class A {
  C fA;
}

class B {
 C fB;
}

class C {
 int fC;
}

When mapping class A to a relational table the question is what is the data
type of column fA and for that, I don't think a struct type C is a good
idea.
Structs are naturally used for composition while from your example we
mostly want an association.

Furthermore, the way you use the DOT notation in queries it mostly means a
join rather than field access.
Thus I would say that these kind of queries are closer to JPQL and OQL than
to pure SQL.

Since you want to use Calcite to analyze heap dumps it may be easier if you
stick to a more traditional RDBMS approach.
>From a quick look at your project, it seems that this is what you are doing
so I guess you are not relying on struct types after all.

Thus for the moment, I am not sure if the recursive struct definition is a
must-have feature for Calcite.


2018-08-14 0:09 GMT+03:00 Michael Mior :

> Returning a list of mytype seems preferable. As far as missing vs. null
> attributes, I think the distinction is commonly not that important and if
> it is, there should be some method of accessing the raw document as a
> fallback.
>
> --
> Michael Mior
> mm...@apache.org
>
>
>
> Le lun. 13 août 2018 à 16:40, Andrei Sereda  a écrit :
>
> > Thanks all for your answers.
> >
> > I still have one question / clarification regarding semantics of select *
> > for document databases (with declared schema).
> >
> > Say I have the following type definition:
> >
> > CREATE TYPE mytype AS (
> > a varchar(2) not null,
> > b varchar(2) NULL // optional (null?)
> > );
> >
> > If my document has only a present ({ a:value }) what should select *
> return
> > ? Map with single value ({a:value}) or a pair (value, null)
> >
> > In other words should select * return raw document (as generic map) or
> list
> > of mytype ?
> > If later, how to differentiate between missing attribute (field) and
> > attribute having null value ?
> >
> > On Sat, Aug 11, 2018 at 2:51 PM Julian Hyde  wrote:
> >
> > > As I noted in https://issues.apache.org/jira/browse/CALCITE-2464 <
> > > https://issues.apache.org/jira/browse/CALCITE-2464>, SQL struct types
> do
> > > not behave exactly like Java classes (more like Java value types). If
> the
> > > semantics are not as desired, maybe we’ll have to design a new type
> > > constructor.
> > >
> > > Since Calcite is grounded in SQL, I would encourage people to give
> > > examples in terms of SQL (DDL, queries, results), not just in terms of
> > the
> > > Java APIs.
> > >
> > > Lastly, I’ll draw your attention to Shuyi’s great work on “CREATE TYPE”
> > > (see https://issues.apache.org/jira/browse/CALCITE-2045 <
> > > https://issues.apache.org/jira/browse/CALCITE-2045>). He extended DDL
> in
> > > the “server” module, so you can try out his examples.
> > >
> > > Julian
> > >
> > >
> > > > On Aug 11, 2018, at 9:03 AM, Vladimir Sitnikov <
> > > sitnikov.vladi...@gmail.com> wrote:
> > > >
> > > > Just to clarify the use case: I'm building SQL plugin to analyze Java
> > > heap
> > > > dumps.
> > > > https://github.com/vlsi/mat-calcite-plugin
> > > >
> > > > select * from "java.lang.String" s  produces a row for each String in
> > the
> > > > heap dump.
> > > >
> > > > Then might be a case like
> > > > select u.path from  "java.net.URL" u;
> > > > That is java.net.URL has "path" field which is of java.lang.String.
> > > >
> > > > Of course Java classes can produce recursive types, so Node { Node
> > next;
> > > }
> > > > bothered me.
> > > >
> > > > The relevant issue is
> > https://issues.apache.org/jira/browse/CALCITE-207
> > > >
> > > > I have asked once if RelDataTypeProxy is welcome in Calcite (
> > > >
> > >
> > https://issues.apache.org/jira/browse/CALCITE-207?
> focusedCommentId=14035245=com.atlassian.jira.
> plugin.system.issuetabpanels:comment-tabpanel#comment-14035245
> > > > ), however it looks like I have to implement it and see what breaks.
> > > >
> > > > The idea there was to use RelDataTypeProxy("Node") as a type for the
> > > "next"
> > > > field. That should avoid "stackoverflow" on cyclic dependencies in
> > types.
> > > > I would love to know your opinion on that if you happen to have one.
> > > >
> > > >

Re: nested structs. querying and building metadata in calcite

2018-08-13 Thread Michael Mior

Returning a list of mytype seems preferable. As far as missing vs. null
attributes, I think the distinction is commonly not that important and if
it is, there should be some method of accessing the raw document as a
fallback.

--
Michael Mior
mm...@apache.org



Le lun. 13 août 2018 à 16:40, Andrei Sereda  a écrit :

> Thanks all for your answers.
>
> I still have one question / clarification regarding semantics of select *
> for document databases (with declared schema).
>
> Say I have the following type definition:
>
> CREATE TYPE mytype AS (
> a varchar(2) not null,
> b varchar(2) NULL // optional (null?)
> );
>
> If my document has only a present ({ a:value }) what should select * return
> ? Map with single value ({a:value}) or a pair (value, null)
>
> In other words should select * return raw document (as generic map) or list
> of mytype ?
> If later, how to differentiate between missing attribute (field) and
> attribute having null value ?
>
> On Sat, Aug 11, 2018 at 2:51 PM Julian Hyde  wrote:
>
> > As I noted in https://issues.apache.org/jira/browse/CALCITE-2464 <
> > https://issues.apache.org/jira/browse/CALCITE-2464>, SQL struct types do
> > not behave exactly like Java classes (more like Java value types). If the
> > semantics are not as desired, maybe we’ll have to design a new type
> > constructor.
> >
> > Since Calcite is grounded in SQL, I would encourage people to give
> > examples in terms of SQL (DDL, queries, results), not just in terms of
> the
> > Java APIs.
> >
> > Lastly, I’ll draw your attention to Shuyi’s great work on “CREATE TYPE”
> > (see https://issues.apache.org/jira/browse/CALCITE-2045 <
> > https://issues.apache.org/jira/browse/CALCITE-2045>). He extended DDL in
> > the “server” module, so you can try out his examples.
> >
> > Julian
> >
> >
> > > On Aug 11, 2018, at 9:03 AM, Vladimir Sitnikov <
> > sitnikov.vladi...@gmail.com> wrote:
> > >
> > > Just to clarify the use case: I'm building SQL plugin to analyze Java
> > heap
> > > dumps.
> > > https://github.com/vlsi/mat-calcite-plugin
> > >
> > > select * from "java.lang.String" s  produces a row for each String in
> the
> > > heap dump.
> > >
> > > Then might be a case like
> > > select u.path from  "java.net.URL" u;
> > > That is java.net.URL has "path" field which is of java.lang.String.
> > >
> > > Of course Java classes can produce recursive types, so Node { Node
> next;
> > }
> > > bothered me.
> > >
> > > The relevant issue is
> https://issues.apache.org/jira/browse/CALCITE-207
> > >
> > > I have asked once if RelDataTypeProxy is welcome in Calcite (
> > >
> >
> https://issues.apache.org/jira/browse/CALCITE-207?focusedCommentId=14035245=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14035245
> > > ), however it looks like I have to implement it and see what breaks.
> > >
> > > The idea there was to use RelDataTypeProxy("Node") as a type for the
> > "next"
> > > field. That should avoid "stackoverflow" on cyclic dependencies in
> types.
> > > I would love to know your opinion on that if you happen to have one.
> > >
> > > It's great that you update executor to support nested structs.
> > >
> > > PS. I've not had a chance to review it.
> > >
> > > Vladimir
> >
> >
>

Re: nested structs. querying and building metadata in calcite

2018-08-13 Thread Andrei Sereda

Thanks all for your answers.

I still have one question / clarification regarding semantics of select *
for document databases (with declared schema).

Say I have the following type definition:

CREATE TYPE mytype AS (
a varchar(2) not null,
b varchar(2) NULL // optional (null?)
);

If my document has only a present ({ a:value }) what should select * return
? Map with single value ({a:value}) or a pair (value, null)

In other words should select * return raw document (as generic map) or list
of mytype ?
If later, how to differentiate between missing attribute (field) and
attribute having null value ?

On Sat, Aug 11, 2018 at 2:51 PM Julian Hyde  wrote:

> As I noted in https://issues.apache.org/jira/browse/CALCITE-2464 <
> https://issues.apache.org/jira/browse/CALCITE-2464>, SQL struct types do
> not behave exactly like Java classes (more like Java value types). If the
> semantics are not as desired, maybe we’ll have to design a new type
> constructor.
>
> Since Calcite is grounded in SQL, I would encourage people to give
> examples in terms of SQL (DDL, queries, results), not just in terms of the
> Java APIs.
>
> Lastly, I’ll draw your attention to Shuyi’s great work on “CREATE TYPE”
> (see https://issues.apache.org/jira/browse/CALCITE-2045 <
> https://issues.apache.org/jira/browse/CALCITE-2045>). He extended DDL in
> the “server” module, so you can try out his examples.
>
> Julian
>
>
> > On Aug 11, 2018, at 9:03 AM, Vladimir Sitnikov <
> sitnikov.vladi...@gmail.com> wrote:
> >
> > Just to clarify the use case: I'm building SQL plugin to analyze Java
> heap
> > dumps.
> > https://github.com/vlsi/mat-calcite-plugin
> >
> > select * from "java.lang.String" s  produces a row for each String in the
> > heap dump.
> >
> > Then might be a case like
> > select u.path from  "java.net.URL" u;
> > That is java.net.URL has "path" field which is of java.lang.String.
> >
> > Of course Java classes can produce recursive types, so Node { Node next;
> }
> > bothered me.
> >
> > The relevant issue is https://issues.apache.org/jira/browse/CALCITE-207
> >
> > I have asked once if RelDataTypeProxy is welcome in Calcite (
> >
> https://issues.apache.org/jira/browse/CALCITE-207?focusedCommentId=14035245=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14035245
> > ), however it looks like I have to implement it and see what breaks.
> >
> > The idea there was to use RelDataTypeProxy("Node") as a type for the
> "next"
> > field. That should avoid "stackoverflow" on cyclic dependencies in types.
> > I would love to know your opinion on that if you happen to have one.
> >
> > It's great that you update executor to support nested structs.
> >
> > PS. I've not had a chance to review it.
> >
> > Vladimir
>
>

Re: nested structs. querying and building metadata in calcite

2018-08-11 Thread Julian Hyde

As I noted in https://issues.apache.org/jira/browse/CALCITE-2464 
, SQL struct types do not 
behave exactly like Java classes (more like Java value types). If the semantics 
are not as desired, maybe we’ll have to design a new type constructor.

Since Calcite is grounded in SQL, I would encourage people to give examples in 
terms of SQL (DDL, queries, results), not just in terms of the Java APIs.

Lastly, I’ll draw your attention to Shuyi’s great work on “CREATE TYPE” (see 
https://issues.apache.org/jira/browse/CALCITE-2045 
). He extended DDL in the 
“server” module, so you can try out his examples.

Julian

> On Aug 11, 2018, at 9:03 AM, Vladimir Sitnikov  
> wrote:
> 
> Just to clarify the use case: I'm building SQL plugin to analyze Java heap
> dumps.
> https://github.com/vlsi/mat-calcite-plugin
> 
> select * from "java.lang.String" s  produces a row for each String in the
> heap dump.
> 
> Then might be a case like
> select u.path from  "java.net.URL" u;
> That is java.net.URL has "path" field which is of java.lang.String.
> 
> Of course Java classes can produce recursive types, so Node { Node next; }
> bothered me.
> 
> The relevant issue is https://issues.apache.org/jira/browse/CALCITE-207
> 
> I have asked once if RelDataTypeProxy is welcome in Calcite (
> https://issues.apache.org/jira/browse/CALCITE-207?focusedCommentId=14035245=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14035245
> ), however it looks like I have to implement it and see what breaks.
> 
> The idea there was to use RelDataTypeProxy("Node") as a type for the "next"
> field. That should avoid "stackoverflow" on cyclic dependencies in types.
> I would love to know your opinion on that if you happen to have one.
> 
> It's great that you update executor to support nested structs.
> 
> PS. I've not had a chance to review it.
> 
> Vladimir

Re: nested structs. querying and building metadata in calcite

2018-08-11 Thread Vladimir Sitnikov

Just to clarify the use case: I'm building SQL plugin to analyze Java heap
dumps.
https://github.com/vlsi/mat-calcite-plugin

select * from "java.lang.String" s  produces a row for each String in the
heap dump.

Then might be a case like
select u.path from  "java.net.URL" u;
That is java.net.URL has "path" field which is of java.lang.String.

Of course Java classes can produce recursive types, so Node { Node next; }
bothered me.

The relevant issue is https://issues.apache.org/jira/browse/CALCITE-207

I have asked once if RelDataTypeProxy is welcome in Calcite (
https://issues.apache.org/jira/browse/CALCITE-207?focusedCommentId=14035245=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14035245
), however it looks like I have to implement it and see what breaks.

The idea there was to use RelDataTypeProxy("Node") as a type for the "next"
field. That should avoid "stackoverflow" on cyclic dependencies in types.
I would love to know your opinion on that if you happen to have one.

It's great that you update executor to support nested structs.

PS. I've not had a chance to review it.

Vladimir

Re: nested structs. querying and building metadata in calcite

2018-08-11 Thread Stamatis Zampetakis

The struct type has to be fully qualified up front so recursive definitions
are not possible.

Node {
  Node left;
  Node right;
};

It may still be able to define tree-like structures similar to the second
example

Node {
  Object left; // it can be Node or other type
  Object right;
};

but it wouldn't be possible to write SQL queries exploiting them using the
DOT syntax.

I don't know what the standard says but from a quick test on Postgres
defining such structures is not possible either.



2018-08-11 15:15 GMT+03:00 Vladimir Sitnikov :

> Stamatis>They're already supported by the SQL parser and the relational
> algebra and
> Stamatis>soon () they
> will be
> Stamatis>supported also by the execution engine in the enumerable
> convention.
>
> Does it require struct type to be fully qualified upfront?
>
> For instance, can it support tree-like structures like the following?
>
> Node {
>   Node left;
>   Node right;
> };
>
> or something like
>
> Node {
>   Object left; // it can be Node or other type
>   Object right;
> };
>
> Vladimir
>

Re: nested structs. querying and building metadata in calcite

2018-08-11 Thread Vladimir Sitnikov

Stamatis>They're already supported by the SQL parser and the relational
algebra and
Stamatis>soon () they
will be
Stamatis>supported also by the execution engine in the enumerable
convention.

Does it require struct type to be fully qualified upfront?

For instance, can it support tree-like structures like the following?

Node {
  Node left;
  Node right;
};

or something like

Node {
  Object left; // it can be Node or other type
  Object right;
};

Vladimir

Re: nested structs. querying and building metadata in calcite

2018-08-11 Thread Stamatis Zampetakis

Hi Andrei,

Andrei>Are nested structs (generic JSON) fully supported in calcite ?

They're already supported by the SQL parser and the relational algebra and
soon () they will be
supported also by the execution engine in the enumerable convention.

Any level of struct is supported without any loss of type information. You
can create a RelDataType that is composed from other RelDataType using the
org.apache.calcite.rel.type.RelDataTypeFactory.Builder. In particular, for
nested types the method you are looking for is probably
org.apache.calcite.rel.type.RelDataTypeFactory.Builder#add(java.lang.String,
org.apache.calcite.rel.type.RelDataType)

Then you can query such kind of nested structures exactly as you wrote them
in your email. For more query examples you can have a look here:
<
https://github.com/zabetak/calcite/blob/871561e8c532661c42e5786d051e1f0018d34965/core/src/test/java/org/apache/calcite/test/ReflectiveSchemaTest.java#L356
>

Best,
Stamatis

On Sat, Aug 11, 2018, 9:37 AM Andrei Sereda  wrote:

> Hello Calcite Devs,
>
> I would like to ask your advice on the following use-case.
>
> Let’s say one wants to automatically construct (and then query) row type
> RelDataType out of existing model. In the case of elastic (but probably
> applicable to other databases as well) there is a notion of mappings
>  current/mapping.html>
> which
> is exposes schema of your documents.
> An over simplified version (of the schema) can look like below:
>
> {
>a: { type: boolean}
>b: {
>c: {type: long},
>d: {type: string}
>e: { f: {type: string}, g:{type: object} }
>}
> }
>
> Is my understanding correct that only top-level elements can
> be explicitly defined in a struct (a and b but not c or g) ?
>
> What about querying such structs. Will the following query work ?
>
> select * from elastic tbl where tbl.b.e.f = 'foo' and tbl.b.c = 42
>
> Looking at the code I have found two examples:
>
>1. _MAP which is a map type between VARCHAR and ANY. Access to fields is
>done using _MAP['b.e.f'] syntax (mongo and elastic adapter)
>2. Flat Structs which maps only top-level fields, nested ones (if they
>exists) are generic
>map types (geode adapter)
>
> The problem with _MAP (1) is that all type and schema information is lost
> and one needs
> to explicitly define a view to be able query a table.
>
> The inconvenience with “flat struct” (2) is that I can’t really query
> nested objects. Of course, one can flatten the structure like {a:string,
> b_e_f:string} but this is rather an ugly solution. Nor the following syntax
> is very elegant: tbl.b['e.f'].
>
> Are nested structs (generic JSON) fully supported in calcite ? If yes, can
> you point me in the right direction (RelDataType definion plus querying) ?
>
> Many Thanks,
> Andrei.
>

Re: nested structs. querying and building metadata in calcite

Re: nested structs. querying and building metadata in calcite

Re: nested structs. querying and building metadata in calcite

Re: nested structs. querying and building metadata in calcite

Re: nested structs. querying and building metadata in calcite

Re: nested structs. querying and building metadata in calcite

Re: nested structs. querying and building metadata in calcite

Re: nested structs. querying and building metadata in calcite

Re: nested structs. querying and building metadata in calcite

Re: nested structs. querying and building metadata in calcite

Re: nested structs. querying and building metadata in calcite

11 matches

Site Navigation

Mail list logo

Footer information