Re: Running Calcite integration tests in docker

2018-05-02 Thread Christian Tzolov
Hi Francis,

Regarding Geode, initially I tried to ingest the test data via Geode's
REST/JSON endpoint, but bumped into this bug:
https://issues.apache.org/jira/browse/GEODE-3971 (still unresolved).

As a consequence I had to write a custom ingestion tool using the Geode Java
API. But since I had to compile and run this tool anyway, it makes little
sense to deploy and maintain the Geode cluster via scripts (or a Docker
image). Instead it is simpler to embed the entire Geode cluster in the same
(standalone) application. Because this is a Spring Boot app, if we deploy the
pre-built jar to a public Maven repo we can easily create a Docker image that
runs it in one line (e.g. java -jar ./)
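
As a rough illustration (base image and jar name below are just placeholders,
not the actual artifact), the Dockerfile could be as trivial as:

  FROM openjdk:8-jre
  COPY geode-standalone-cluster.jar /opt/app.jar
  ENTRYPOINT ["java", "-jar", "/opt/app.jar"]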

Any ideas where we can host this project and where to release it?

Cheers,
Christian


On 23 April 2018 at 14:12, Francis Chuang  wrote:

> Thanks, Michael!
>
> I noticed that I forgot the link to my fork in my original message. Here
> is my fork if someone wants to hack on it a bit more:
> https://github.com/Boostport/calcite-test-dataset/tree/switch-to-docker
>
>
> On 23/04/2018 9:58 PM, Michael Mior wrote:
>
>> Thanks for raising this Francis. I was hoping to find more time to spend
>> on
>> this but unfortunately that hasn't happened.
>>
>> 1. That's a question for Christian Tzolov. I'm not too familiar with
>> Geode.
>> 2. You are correct that the VM contains several different database servers
>> with various ports exposed. I'm not sure what the situation is with
>> HSQLDB/H2.
>> 3. Maven is definitely not strictly necessary although some of the
>> dependencies currently pull in datasets that are used for some of the DBs
>> before building the VM.
>> 4.  I don't really have a strong preference either way. I'm sure someone
>> else can speak to why this was separated in the first place.
>>
>>
>> --
>> Michael Mior
>> mm...@uwaterloo.ca
>>
>> 2018-04-23 7:11 GMT-04:00 Francis Chuang :
>>
>> There is currently an issue open for this in the calcite-test-dataset
>>> repository[1], however, I would like to hear more from the wider
>>> community
>>> regarding this.
>>>
>>> I have created a `switch-to-docker` branch on my fork and committed a
>>> docker-compose.yml under the docker folder, but ran into a few roadblocks
>>> and didn't have any more time to investigate.
>>>
>>> I am currently investigating using docker-composer to orchestrate and set
>>> up the containers.
>>>
>>> Questions:
>>>
>>> 1. I am not very familiar with Apache Geode. I was able to start the
>>> server and locator using the official docker image, but there does not
>>> appear to be any way to import data. In the current repository, there's
>>> some
>>> java code in `geode-standalone-cluster`. Why do/did we need to write Java
>>> code to stand up a geode cluster? Does anyone know if there are any
>>> standalone tools (preferably something with built binaries) that we can
>>> use
>>> to directly ingest the JSON data?
>>>
>>> 2. From my reading of the integration test instructions[2], the
>>> calcite-test-dataset spins up a VM with databases preloaded with data
>>> which
>>> the main calcite repository runs tests against. HSQLDB and H2 do not have
>>> any open ports in the VM that's spun up. How does Calcite run tests
>>> against HSQLDB and H2?
>>>
>>> 3. What is the role of maven in the calcite-test-dataset repository? I
>>> see
>>> a lot of POMs in various subfolders such as mysql, postgresql, etc.
>>> However, I am not sure what these do. If maven is used to spin up the VM,
>>> perhaps we could remove the dependency on it and just run a
>>> `docker-compose
>>> up` to start the network of containers.
>>>
>>> 4. Is there any interest in bringing the contents of calcite-test-dataset
>>> directly into the Calcite repo? The repo zips up to 1.5MB, so it might not
>>> bring too much bloat to the Calcite repo.
>>>
>>> Francis
>>>
>>> [1] https://github.com/vlsi/calcite-test-dataset/issues/8
>>>
>>> [2] https://calcite.apache.org/docs/howto.html#running-integration-tests
>>>
>>>
>>>
>


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring.io <https://spring.io/> | Pivotal <http://pivotal.io/> |
ctzo...@pivotal.io


Re: Calcite class diagram

2018-03-19 Thread Christian Tzolov
Muhammad,

You can find a couple of UML class diagrams in my ApacheCon slides here:
https://speakerdeck.com/tzolov/using-apache-calcite-for-enabling-sql-and-jdbc-access-to-apache-geode-and-other-nosql

Check pages 28, 29, 31, 32 and, most importantly, page 33 ;)

Cheers,
Christian

On 18 March 2018 at 20:51, Edmon Begoli  wrote:

> It is in a paper that went for SIGMOD this year:
> https://arxiv.org/abs/1802.10233
>
> Keep in mind, it is not much, but if you read the paper you will get a
> pretty decent idea at the high-to-mid level of what is going on.
>
> On Sun, Mar 18, 2018 at 3:49 PM, Muhammad Gelbana 
> wrote:
>
> > Is this UML shared somewhere I can access ? Would you please share it ?
> Any
> > insight into Calcite's structure may make all the difference for me.
> >
> > Thanks,
> > Gelbana
> >
> > On Sun, Mar 18, 2018 at 9:24 PM, Edmon Begoli  wrote:
> >
> > > This isn't realistic for any real-life framework. UML breaks down very quickly
> > > when applied to real software - both generation and maintenance.
> > >
> > > We have UML for some high level concepts such as adapters, etc., but it
> > is
> > > conceptual, not 1:1 mapped class diagrams.
> > >
> > >
> > > On Sun, Mar 18, 2018 at 3:22 PM, Muhammad Gelbana  >
> > > wrote:
> > >
> > > > Has anyone generated a decent class diagram representing Calcite's
> core
> > > > classes and their relations ?
> > > >
> > > > I've just tried to create one using *code2uml* and in a few minutes
> it
> > > has
> > > > generated a 5.9 GB image file ! And it wasn't 10% complete !
> > > >
> > > > Would you kindly share if you have anything of sort ?
> > > >
> > > > Thanks,
> > > > Gelbana
> > > >
> > >
> >
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring.io <https://spring.io/> | Pivotal <http://pivotal.io/> |
ctzo...@pivotal.io


GroupBy for Indefinite SQL Streams

2017-12-09 Thread Christian Tzolov
Following the CsvStreamingTableXXX and StreamTest.java examples I've
implemented a naive Geode "streaming" adapter prototype [1].

If I understand it correctly, the sql-stream implementation for querying
"indefinite" tables returns an "indefinite" ResultSet, so one can keep
calling ResultSet.next() for the next result produced.
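
For reference, this is roughly how I consume the indefinite ResultSet in my
prototype (a simplified sketch; the model path is a placeholder and the query
is the one shown below):

  Connection connection = DriverManager.getConnection("jdbc:calcite:model=<my model>");
  Statement statement = connection.createStatement();
  ResultSet resultSet = statement.executeQuery(
      "SELECT STREAM FLOOR(rowtime TO MINUTE) AS rowtime FROM BookMasterStream");
  while (resultSet.next()) {  // blocks until the stream produces the next row
    System.out.println(resultSet.getTimestamp("rowtime"));
  }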

This logic works for non-aggregation stream queries, where every input into
the table triggers a new result in the ResultSet. For example, the following
query yields the expected result:

SELECT STREAM FLOOR(rowtime TO MINUTE) AS rowtime FROM BookMasterStream

For streaming aggregation (group-by) queries, though, the
GeodeStreamEnumerator's current() and moveNext() methods are called
indefinitely but no result is ever produced. It seems that the execution
never gets out of this loop:
https://github.com/apache/calcite/blob/master/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L825

In a way this is expected, given the indefinite nature of the input table.
What I don't understand is why or how this seems to work in the
StreamTest.java#148 tests.
But I'm almost certain that it will not work if you use aggregation instead
of a simple scan in the CsvTest.java#758 example.

I am really stuck here and would appreciate any advice.

Thanks,
Christian

[1]
https://github.com/tzolov/calcite/tree/geode-1.3-stream/geode/src/main/java/org/apache/calcite/adapter/geode/stream




-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/> |
ctzo...@pivotal.io


Re: About the Spatial functionality

2017-12-07 Thread Christian Tzolov
Hi Vamshi,

Have you set the conformance to one that supports GEOMETRY? I've been
using LENIENT, like this: jdbc:calcite:conformance=LENIENT;model=...my model

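For completeness, here is roughly how I wire it up programmatically (the model
path is a placeholder; the addFunctions call is the same one you are already
using, and the Calcite classes are the obvious org.apache.calcite ones):

  Properties info = new Properties();
  info.setProperty("conformance", "LENIENT");
  info.setProperty("model", "/path/to/my-model.json");  // placeholder
  Connection connection = DriverManager.getConnection("jdbc:calcite:", info);
  CalciteConnection calciteConnection = connection.unwrap(CalciteConnection.class);
  // register the spatial UDFs on the root schema
  ModelHandler.addFunctions(calciteConnection.getRootSchema(), null,
      ImmutableList.of(), GeoFunctions.class.getName(), "*", true);
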
On 7 December 2017 at 13:53, Vamshi Krishna 
wrote:

> Hello Team,
>
> I have tried to use these functions by adding it to my schema as given
> below:
>
> ModelHandler.addFunctions(rootSchema, null, ImmutableList.of(),
> GeoFunctions.class.getName(), "*", true);
>
> but i run into an validation issue when calling the planner's validate
> routine with the below error:
> No match found for function signature ST_MAKEPOINT(, ,
> )
>
> Do we have to register these functions manually similar to
> OracleSqlOperatorTable or is there another way out for this ?
>
>
> Thanks,
> Vamshi.
>
>
>
>
> On Sat, Dec 2, 2017 at 4:33 AM, Christian Tzolov 
> wrote:
>
> > @Michael, sure go ahead and use the query if you find it fit. I am
> looking
> > forward to read this paper! If you need an "external" opinion about how
> > Calcite fits in the broader  data management ecosystem or "patterns" for
> > building Calcite adapters i can share few ideas ;)
> >
> > @Julian, i'm aware and agree with the approach for providing spatial
> > support. Currently I'm only trying to make it work (somehow) in the
> context
> > of the Geode adapter! My goal is to mention it during my talk at Apache
> > Geode Summit [1] on Monday.
> >
> > While on the topic of my talk [1], i've been looking for tempting reasons
> > to engage/involve the Geode community with the project or at least start
> > the argument. Here are my slides [2] (for internal use only until
> Monday).
> > In the pros and cons section i've shared some ideas:
> >
> > 1. In the context of data exploration and cleansing, an obvious advantage
> > is the easiness for integration with 3rd party DMS tools and
> >
> > 2. In the same context the easy to correlate Geode data with data from
> > multiple Sql and NoSql data stores (e.g. Data Federation).
> >
> > 3. But given that Geode is primarily used for OLTP-ish (e.g.
> > transactional)  workloads, even the OQL is considered a second-class
> > citizen. Therefore i find the concept of "SQL Stream" quite relevant and
> i
> > expect that the Geode community will find it interesting too. Geode
> already
> > provides limited CQ (Continues Querying) functionality and IMO the "Sql
> > Stream" will be like advanced CQ++. Unfortunately i haven't had time to
> > build and prototype in the context of Geode. Can you point me to some
> code
> > examples? I know about the Tests but are there and actual adapters or
> other
> > applications that use the "SQL Stream"?
> >
> > 4. @Julian you have mentioned the idea about dynamic materializations.
> But
> > i'm not sure i completely understand the approach and that confident to
> > bring it for discussion. If you have some written references that can
> help
> > me i will appreciate it.
> >
> > Cheers,
> > Christian
> >
> >
> > [1] Enable SQL/JDBC Access to Apache Geode/GemFire Using Apache Calcite:
> > https://springoneplatform.io/sessions/enable-sql-jdbc-
> > access-to-apache-geode-gemfire-using-apache-calcite
> > [2] Slides:
> > https://docs.google.com/presentation/d/1zo473pcupWEjRXOA_
> > W5rgaKSmS2Vmyl2U2ATKmrS26M/edit?usp=sharing
> >
> >
> > On 1 December 2017 at 21:05, Julian Hyde  wrote:
> >
> > > The Natural earth dataset (which https://github.com/zzolo/geo_
> > > simple_countries_wkt_csv <https://github.com/zzolo/geo_
> > > simple_countries_wkt_csv> is based upon) is Public Domain, which makes
> it
> > > suitable for our purposes.
> > >
> > > > Although inefficient (all spatial computations happen on calcite
> side)
> > > it is still very cool! :)
> > >
> > > That’s exactly what I was going for. First make it work (by adding all
> > > OpenGIS functions as UDFs), then make it fast (by adding rewrite rules
> > that
> > > recognize functions and particular patterns of materialized views).
> > >
> > > Your query is a spatial join of polygons (cities) to polygons
> > (countries).
> > > I have in mind a materialized view where polygons are sliced into
> > bounding
> > > “tiles” and I think it should speed up this kind of query.
> > >
> > > Julian
> > >
> > >
> > > > On Dec 1, 2017, at 9:09 AM, Christian Tzolov 
> > wrote:

Re: "Streaming aggregation requires at least one monotonic expression"

2017-12-04 Thread Christian Tzolov
Thanks Julian, this could have been the problem.

I have another question: how is the streaming (e.g. continuous) query
supposed to be consumed, given that the enumerator never ends unless
canceled?

For example, I am running the following query through sqlline:

SELECT STREAM
  FLOOR("rowtime" TO SECOND) AS "rowtime2",
  "itemNumber",
  SUM("retailCost") AS "retailCost2"
FROM "BookMaster"
GROUP BY FLOOR("rowtime" TO SECOND), "itemNumber";

It doesn't emit any errors, but it doesn't compute/show any results either (I
am emitting new data in the meantime).




On 4 December 2017 at 05:05, Julian Hyde  wrote:

> If you change
>
> >  AS "rowtime”,
>
> to
>
> > AS "rowtime2",
>
> is the behavior any different? I am concerned that the “rowtime” column is
> visible in the GROUP BY clause. (It shouldn’t be, but bugs happen.)
>
>
> > On Dec 3, 2017, at 3:04 PM, Christian Tzolov  wrote:
> >
> > I have implemented a StreamingTable with monotonic
> > ​ ​
> > rowtime
> > ​ ​
> > column:
> >
> > 1: jdbc:calcite:model=/Users/ctzolov/Dev/proj> !columns "BookMaster
> > ​"​
> >
> > +---+-++-+--
> -+---+-+
> > | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE |
> > TYPE_NAME | |
> > +---+-++-+--
> -+---+-+
> > *|   | bookshopstream | BookMaster | rowtime | 93|
> > TIMESTAM |*
> > |   | bookshopstream | BookMaster | itemNumber  | 4 |
> > JavaType |
> > |   | bookshopstream | BookMaster | description | 12|
> > JavaType |
> > |   | bookshopstream | BookMaster | retailCost  | 8 |
> > JavaType |
> > |   | bookshopstream | BookMaster | yearPublished | 4 |
> > JavaTy |
> > |   | bookshopstream | BookMaster | author  | 12|
> > JavaType |
> > |   | bookshopstream | BookMaster | title   | 12|
> > JavaType |
> > +---+-++-+--
> -+---+-+
> >
> > But when i try
> > ​a ​
> > streaming query like this:
> >
> > SELECT STREAM
> >  FLOOR("rowtime" TO HOUR) AS "rowtime",
> >  SUM("retailCost") AS "retailCost"
> > FROM "BookMaster"
> > GROUP BY FLOOR("rowtime" TO HOUR)
> >
> > It
> > ​ complains of a missing monotonic expression: ​
> >
> >
> > From line 5, column 1 to line 5, column 33: Streaming aggregation
> requires
> > at least one monotonic expression in GROUP BY clause (state=,code=0)
> >
> > ​What is missing? ​
> >
> > ​Thanks​
> >
> > --
> > Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principle
> Software
> > Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/>
> |
> > ctzo...@pivotal.io
>
>


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/> |
ctzo...@pivotal.io


Re: query about avatica odbc

2017-12-04 Thread Christian Tzolov
An Avatica ODBC driver would be a handy (and, I guess, good selling) feature
indeed!

@Josh, the documentation mentions that Apache Phoenix has an ODBC
implementation over the Avatica protocol, but I couldn't find any code or
even further info. I guess it is a proprietary/commercial project?

Last week I troubled myself trying to set up the OpenLink ODBC-to-JDBC
bridge (http://bit.ly/2AWpbhU) on my MacOS machine for the Calcite JDBC
driver. It was a complete failure, and in the process I almost wrecked my
environment. The bridge is a monstrous approach to start with, but just out
of curiosity, has anyone ever succeeded in plugging the Avatica JDBC driver
into such an ODBC-JDBC bridge?



On 4 December 2017 at 06:24, Josh Elser  wrote:

> Hi Victor,
>
> An ODBC driver for Avatica would be a great addition to the project, but I
> am not aware of anyone who has volunteered to take on this development
> effort.
>
> On 12/1/17 9:03 PM, victor wrote:
>
>> Hi, calcite development team:
>>  we now use calcite for some databases development ,  the calcite
>> gives us a lot of convenience, thanks all of you. but it seems there has
>> not an odbc driver for avatica,  do you have some plan for odbc driver
>> developing, please let me know as soon as possible,
>> thanks a lot,
>>
>>
>> yours  best regards
>>
>>
>> victor lv
>>
>>


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/> |
ctzo...@pivotal.io


Re: "Streaming aggregation requires at least one monotonic expression"

2017-12-03 Thread Christian Tzolov
A step further!

Although not documented, it appears that the streamable table must implement
the getStatistic() method, even if it returns empty sets. Otherwise the
monotonicity computation in IdentifierNamespace#getMonotonicity will always
return SqlMonotonicity.NOT_MONOTONIC, causing the previous error.

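Roughly what I added to the table (a simplified sketch from memory; the
Statistic/Statistics classes are the org.apache.calcite.schema ones and the
row count is an arbitrary placeholder):

  @Override public Statistic getStatistic() {
    // empty key/collation lists are enough; without this override
    // IdentifierNamespace#getMonotonicity always returns NOT_MONOTONIC
    return Statistics.of(100d, ImmutableList.<ImmutableBitSet>of(),
        ImmutableList.<RelCollation>of());
  }
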
Now I'm facing another issue, "java.lang.ClassCastException:
java.lang.String cannot be cast to java.lang.Double", that appears to be
related to the GROUP BY FLOOR("rowtime" TO HOUR) expression.

I will continue exploring, but please share if there are some other
(undocumented) settings I need to set.

Thanks,
Christian


On 3 December 2017 at 23:04, Christian Tzolov  wrote:

>
> I have implemented a StreamingTable with monotonic
> ​ ​
> rowtime
> ​ ​
> column:
>
> 1: jdbc:calcite:model=/Users/ctzolov/Dev/proj> !columns "BookMaster
> ​"​
>
> +---+-++-+--
> -+---+-+
> | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | COLUMN_NAME | DATA_TYPE |
> TYPE_NAME | |
> +---+-++-+--
> -+---+-+
> *|   | bookshopstream | BookMaster | rowtime | 93|
> TIMESTAM |*
> |   | bookshopstream | BookMaster | itemNumber  | 4 |
> JavaType |
> |   | bookshopstream | BookMaster | description | 12|
> JavaType |
> |   | bookshopstream | BookMaster | retailCost  | 8 |
> JavaType |
> |   | bookshopstream | BookMaster | yearPublished | 4 |
> JavaTy |
> |   | bookshopstream | BookMaster | author  | 12|
> JavaType |
> |   | bookshopstream | BookMaster | title   | 12|
> JavaType |
> +---+-++-+--
> -+---+-+
>
> But when i try
> ​a ​
> streaming query like this:
>
> SELECT STREAM
>   FLOOR("rowtime" TO HOUR) AS "rowtime",
>   SUM("retailCost") AS "retailCost"
> FROM "BookMaster"
> GROUP BY FLOOR("rowtime" TO HOUR)
>
> It
> ​ complains of a missing monotonic expression: ​
>
>
> From line 5, column 1 to line 5, column 33: Streaming aggregation requires
> at least one monotonic expression in GROUP BY clause (state=,code=0)
>
> ​What is missing? ​
>
> ​Thanks​
>
> --
> Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principle Software
> Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/>
> | ctzo...@pivotal.io
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/> |
ctzo...@pivotal.io


"Streaming aggregation requires at least one monotonic expression"

2017-12-03 Thread Christian Tzolov
I have implemented a StreamingTable with a monotonic rowtime column:

1: jdbc:calcite:model=/Users/ctzolov/Dev/proj> !columns "BookMaster"

+-----------+----------------+------------+---------------+-----------+-----------+
| TABLE_CAT | TABLE_SCHEM    | TABLE_NAME | COLUMN_NAME   | DATA_TYPE | TYPE_NAME |
+-----------+----------------+------------+---------------+-----------+-----------+
|           | bookshopstream | BookMaster | rowtime       | 93        | TIMESTAMP |
|           | bookshopstream | BookMaster | itemNumber    | 4         | JavaType  |
|           | bookshopstream | BookMaster | description   | 12        | JavaType  |
|           | bookshopstream | BookMaster | retailCost    | 8         | JavaType  |
|           | bookshopstream | BookMaster | yearPublished | 4         | JavaType  |
|           | bookshopstream | BookMaster | author        | 12        | JavaType  |
|           | bookshopstream | BookMaster | title         | 12        | JavaType  |
+-----------+----------------+------------+---------------+-----------+-----------+

But when I try a streaming query like this:

SELECT STREAM
  FLOOR("rowtime" TO HOUR) AS "rowtime",
  SUM("retailCost") AS "retailCost"
FROM "BookMaster"
GROUP BY FLOOR("rowtime" TO HOUR)

It complains of a missing monotonic expression:

From line 5, column 1 to line 5, column 33: Streaming aggregation requires
at least one monotonic expression in GROUP BY clause (state=,code=0)

What is missing?

Thanks

-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/> |
ctzo...@pivotal.io


Re: About the Spatial functionality

2017-12-02 Thread Christian Tzolov
@Michael, sure, go ahead and use the query if you find it a fit. I am looking
forward to reading this paper! If you need an "external" opinion about how
Calcite fits in the broader data management ecosystem, or "patterns" for
building Calcite adapters, I can share a few ideas ;)

@Julian, I'm aware of and agree with the approach for providing spatial
support. Currently I'm only trying to make it work (somehow) in the context
of the Geode adapter! My goal is to mention it during my talk at the Apache
Geode Summit [1] on Monday.

While on the topic of my talk [1], I've been looking for compelling reasons
to engage/involve the Geode community with the project, or at least start
the conversation. Here are my slides [2] (for internal use only until Monday).
In the pros and cons section I've shared some ideas:

1. In the context of data exploration and cleansing, an obvious advantage
is the ease of integration with 3rd party DMS tools.

2. In the same context, it is easy to correlate Geode data with data from
multiple SQL and NoSQL data stores (e.g. data federation).

3. Given that Geode is primarily used for OLTP-ish (e.g.
transactional) workloads, even OQL is considered a second-class
citizen. Therefore I find the concept of "SQL Stream" quite relevant, and I
expect that the Geode community will find it interesting too. Geode already
provides limited CQ (Continuous Querying) functionality, and IMO the "SQL
Stream" will be like an advanced CQ++. Unfortunately I haven't had time to
build a prototype in the context of Geode. Can you point me to some code
examples? I know about the tests, but are there any actual adapters or other
applications that use the "SQL Stream"?

4. @Julian, you have mentioned the idea of dynamic materializations. But
I'm not sure I completely understand the approach, nor am I confident enough
to bring it up for discussion. If you have some written references that can
help me, I would appreciate it.

Cheers,
Christian


[1] Enable SQL/JDBC Access to Apache Geode/GemFire Using Apache Calcite:
https://springoneplatform.io/sessions/enable-sql-jdbc-access-to-apache-geode-gemfire-using-apache-calcite
[2] Slides:
https://docs.google.com/presentation/d/1zo473pcupWEjRXOA_W5rgaKSmS2Vmyl2U2ATKmrS26M/edit?usp=sharing


On 1 December 2017 at 21:05, Julian Hyde  wrote:

> The Natural earth dataset (which https://github.com/zzolo/geo_
> simple_countries_wkt_csv <https://github.com/zzolo/geo_
> simple_countries_wkt_csv> is based upon) is Public Domain, which makes it
> suitable for our purposes.
>
> > Although inefficient (all spatial computations happen on calcite side)
> it is still very cool! :)
>
> That’s exactly what I was going for. First make it work (by adding all
> OpenGIS functions as UDFs), then make it fast (by adding rewrite rules that
> recognize functions and particular patterns of materialized views).
>
> Your query is a spatial join of polygons (cities) to polygons (countries).
> I have in mind a materialized view where polygons are sliced into bounding
> “tiles” and I think it should speed up this kind of query.
>
> Julian
>
>
> > On Dec 1, 2017, at 9:09 AM, Christian Tzolov  wrote:
> >
> > The OpenGIS Spec datasets sounds like right, "canonical" spatial dataset
> to have.
> >
> > In the meantime for the purposes of my tests i found a dataset (
> https://github.com/zzolo/geo_simple_countries_wkt_csv <
> https://github.com/zzolo/geo_simple_countries_wkt_csv>) that contains
> world country boundaries as WKT polygons along with their names, ISO
> abbreviations  and other metadata. I've also converted the csv into json
> (attached) to make it easy for loading in Geode.
> >
> > This allows me to run crazy queries like this :)
> >
> > SELECT "NAME", ST_Distance("Country", ST_GeomFromText('POINT(23.288269
> 42.731883)')) as "distanceToBG"
> > FROM (
> >   SELECT
> >"NAME",
> > ST_GeomFromText('POLYGON((4.822998 52.427652, 4.971313 52.427652,
> 4.971313 52.333661, 4.822998 52.333661, 4.822998 52.427652))') AS
> "Amsterdam",
> > ST_GeomFromText("WKT") AS "Country"
> >   FROM "geode"."Country"
> > )
> > WHERE ST_Contains("Country", "Amsterdam");
> >
> > E.g. retrieves the countries that contain the Amsterdam, NL area and for
> the result computes the distances to Sofia, BG. The result is actually
> correct :)
> >
> > | Netherlands | 18.93796871505074 |
> >
> > Although inefficient (all spatial computations happen on calcite side)
> it is still very cool! :)
> >
> > Btw the dataset license seems permissive and if you are interested 

Re: Handling Functions in custom adapter implementations?

2017-11-28 Thread Christian Tzolov
Yeah, unlike Cassandra, Geode supports certain function types (such as CAST
and ITEM). To avoid dropping support for those, I've blacklisted just
RexNodes of type SqlTypeName.GEOMETRY for now.

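Roughly what that "blacklist" looks like in my Project rule (a simplified
sketch from my branch, not the exact code):

  @Override public boolean matches(RelOptRuleCall call) {
    LogicalProject project = call.rel(0);
    for (RexNode e : project.getProjects()) {
      if (e.getType().getSqlTypeName() == SqlTypeName.GEOMETRY) {
        return false;  // fall back to the Enumerable projection for spatial expressions
      }
    }
    return true;
  }
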
That got me a step further, to the ST_Point parameter types issue :)

"No applicable constructor/method found for actual parameters "double,
double"; candidates are: "public static
org.apache.calcite.runtime.GeoFunctions$Geom
org.apache.calcite.runtime.GeoFunctions.ST_Point(java.math.BigDecimal,
java.math.BigDecimal, java.math.BigDecimal)", "public static
org.apache.calcite.runtime.GeoFunctions$Geom
org.apache.calcite.runtime.GeoFunctions.ST_Point(java.math.BigDecimal,
java.math.BigDecimal)""

This is expected, as ST_Point takes BigDecimal (i.e. the DECIMAL SQL type)
rather than Double parameters.

While cast("loc"[0] AS DOUBLE) works because "loc"[0] is a Numeric,
cast("loc"[0] AS DECIMAL) doesn't, because it tries to cast the Numeric to
the BigDecimal Java type!

Is there another function (apart from CAST) that would allow me to convert
the Numeric into a BigDecimal?

Thanks,
Christian


On 28 November 2017 at 22:13, Michael Mior  wrote:

> Yes, the appropriate solution would be to check the expressions being
> projected and not to trigger your rule for projections you can't handle.
> For example, check out CassandraProjectRule#matches which validates that
> only field references are being projected.
>
> --
> Michael Mior
> mm...@apache.org
>
> 2017-11-28 15:39 GMT-05:00 Christian Tzolov :
>
> > Hey there,
> >
> > I have another question related to
> > ​ ​
> > handling
> > ​​
> > spatial (
> > ​or​
> > any) functions in custom adapter implementations.
> >
> > For e
> > ​ ​
> > example the
> > ​ ​
> > ​rel
> > . plan for the following  query
> > ​(​
> > with
> > ​​
> > spatial function
> > ​)​
> >
> >
> > EXPLAIN PLAN FOR SELECT
> >   "city",
> >   ST_Point(
> >   cast("loc" [0] AS DOUBLE),
> >   cast("loc" [1] AS DOUBLE)
> >   )
> > FROM "geode"."Zips"
> > LIMIT
> > 10;
> >
> > will ​produce plan like this:
> >
> >
> > GeodeToEnumerableConverterRel
> >   GeodeProjectRel(city=[$1], EXPR$1=[*ST_POIN*T(CAST(ITEM($2,
> 0)):DOUBLE,
> > CAST(ITEM($2, 1)):DOUBLE)])
> > GeodeSortRel(fetch=[10])
> >   GeodeTableScanRel(table=[[geode, Zips]])
> >
> > ​Given that my ​GeodeProjectRel implementation doesn't know about the
> > *ST_POIN*T this should fail?  Or somehow get ignored in my case.
> >
> > What is the correct way to handle such cases?
> >
> > Perhaps in my Project rule i should make sure that projection with
> function
> > expression should not be matched so it falls back to the the
> > EnumerableProjection?
> >
> > Thanks,
> > Christian
> >
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/> |
ctzo...@pivotal.io


Handling Functions in custom adapter implementations?

2017-11-28 Thread Christian Tzolov
Hey there,

I have another question, related to handling spatial (or any other) functions
in custom adapter implementations.

For example, the rel. plan for the following query (with a spatial function)


EXPLAIN PLAN FOR SELECT
  "city",
  ST_Point(
  cast("loc" [0] AS DOUBLE),
  cast("loc" [1] AS DOUBLE)
  )
FROM "geode"."Zips"
LIMIT
10;

will produce a plan like this:


GeodeToEnumerableConverterRel
  GeodeProjectRel(city=[$1], EXPR$1=[ST_POINT(CAST(ITEM($2, 0)):DOUBLE,
CAST(ITEM($2, 1)):DOUBLE)])
GeodeSortRel(fetch=[10])
  GeodeTableScanRel(table=[[geode, Zips]])

Given that my GeodeProjectRel implementation doesn't know about
ST_POINT, should this fail? Or somehow get ignored in my case?

What is the correct way to handle such cases?

Perhaps in my Project rule I should make sure that a projection with a function
expression is not matched, so that it falls back to the
EnumerableProjection?

Thanks,
Christian


Re: About the Spatial functionality

2017-11-28 Thread Christian Tzolov
Ok, I think I solved the riddle. I had to remove the quotes from the
function name (e.g. use ST_Point instead of "ST_Point"). This is due to the
upCase=TRUE parameter in addFunctions.

I don't see the error anymore. Now I'm facing another issue that I believe is
related to my adapter implementation.

Thanks for the support!

On 28 November 2017 at 18:43, Christian Tzolov  wrote:

> Unfortunately it didn't help still get " No match found for function
> signature ST_Point(, )"
> ​.
>
> ​Could it be that i need to ad some schema or other prefix? e.g.
> "geode"."ST_Point"(
>
> Also can i check interactively what are the registered functions? ​
>
> On 28 November 2017 at 18:33, Michael Mior  wrote:
>
>> I believe that should work. I'll let others correct me if I'm missing the
>> boat here.
>>
>> --
>> Michael Mior
>> mm...@apache.org
>>
>> 2017-11-28 12:31 GMT-05:00 Christian Tzolov :
>>
>> > Thanks @Michael!  Can i assume that
>> > ​ ​
>> > in
>> > ​ ​
>> > the SchemaFactory
>> > ​#​
>> > create(SchemaPlus parentSchema, String name,
>> > ​ ...​
>> > )
>> > ​ method ​the root schema is constructed?  And can i use the
>> > parentSchema
>> > ​ like this:
>> >
>> > ModelHandler.addFunctions(parentSchema, null,
>> ImmutableList.of(),
>> > ​ ​
>> > GeoFunctions.class.getName(), "*", true);
>> >
>> > On 28 November 2017 at 16:58, Michael Mior  wrote:
>> >
>> > > I believe the geospatial functions are not currently registered by
>> > default.
>> > > You can see an example of how to do this in CalciteAssert.java. Once
>> you
>> > > have constructed the root schema, the following should be sufficient:
>> > >
>> > > ModelHandler.addFunctions(rootSchema, null,
>> ImmutableList.of(),
>> > > GeoFunctions.class.getName(), "*", true);
>> > >
>> > > --
>> > > Michael Mior
>> > > mm...@apache.org
>> > >
>> > > 2017-11-28 4:27 GMT-05:00 Christian Tzolov :
>> > >
>> > > > I've tried to cast the Zip's loc column into double like this:
>> > > >
>> > > > SELECT
>> > > > ​ ​
>> > > > "city",  cast("loc" [0] AS DOUBLE) AS "lon",  cast("loc" [1] AS
>> DOUBLE)
>> > > AS
>> > > > "lat"
>> > > > ​ ​
>> > > > FROM "geode"."Zips"
>> > > > ​ ​
>> > > > LIMIT  10;
>> > > >
>> > > > ​This seems to work fine. ​But when i try to use the ST_Point
>> function
>> > i
>> > > > get: "No match found for function signature ST_Point(,
>> > > )"
>> > > > (full stack is below)
>> > > >
>> > > > It seems like i've not registered a jar dependency or haven't
>> enabled
>> > > > something else?
>> > > >
>> > > >
>> > > > jdbc:calcite:conformance=LENIENT> SELECT "city",
>> "ST_Point"(cast("loc"
>> > > [0]
>> > > > AS DOUBLE), cast("loc" [1] AS DOUBLE)) FROM "geode"."Zips"LIMIT 10;
>> > > > 2017-11-28 10:19:15,199 [main] ERROR -
>> > > > org.apache.calcite.sql.validate.SqlValidatorException: No match
>> found
>> > > for
>> > > > function signature ST_Point(, )
>> > > > 2017-11-28 10:19:15,199 [main] ERROR -
>> > > > org.apache.calcite.runtime.CalciteContextException: From line 1,
>> > column
>> > > 16
>> > > > to line 1, column 79: No match found for function signature
>> > > > ST_Point(, )
>> > > > Error: Error while executing SQL "SELECT "city",
>> "ST_Point"(cast("loc"
>> > > [0]
>> > > > AS DOUBLE), cast("loc" [1] AS DOUBLE)) FROM "geode"."Zips"LIMIT 10":
>> > From
>> > > > line 1, column 16 to line 1, column 79: No match found for function
>> > > > signature ST_Point(, ) (state=,code=0)
>> > > >
>> > > > On 28 November 2017 at 09:32, Christian Tzolov 
>> > > wrote:
>> > > >
>> > > > > @Julian are there 

Re: About the Spatial functionality

2017-11-28 Thread Christian Tzolov
Unfortunately it didn't help; I still get "No match found for function
signature ST_Point(, )".

Could it be that I need to add some schema or other prefix? E.g.
"geode"."ST_Point"(

Also, can I check interactively what the registered functions are?

On 28 November 2017 at 18:33, Michael Mior  wrote:

> I believe that should work. I'll let others correct me if I'm missing the
> boat here.
>
> --
> Michael Mior
> mm...@apache.org
>
> 2017-11-28 12:31 GMT-05:00 Christian Tzolov :
>
> > Thanks @Michael!  Can i assume that
> > ​ ​
> > in
> > ​ ​
> > the SchemaFactory
> > ​#​
> > create(SchemaPlus parentSchema, String name,
> > ​ ...​
> > )
> > ​ method ​the root schema is constructed?  And can i use the
> > parentSchema
> > ​ like this:
> >
> > ModelHandler.addFunctions(parentSchema, null,
> ImmutableList.of(),
> > ​ ​
> > GeoFunctions.class.getName(), "*", true);
> >
> > On 28 November 2017 at 16:58, Michael Mior  wrote:
> >
> > > I believe the geospatial functions are not currently registered by
> > default.
> > > You can see an example of how to do this in CalciteAssert.java. Once
> you
> > > have constructed the root schema, the following should be sufficient:
> > >
> > > ModelHandler.addFunctions(rootSchema, null,
> ImmutableList.of(),
> > > GeoFunctions.class.getName(), "*", true);
> > >
> > > --
> > > Michael Mior
> > > mm...@apache.org
> > >
> > > 2017-11-28 4:27 GMT-05:00 Christian Tzolov :
> > >
> > > > I've tried to cast the Zip's loc column into double like this:
> > > >
> > > > SELECT
> > > > ​ ​
> > > > "city",  cast("loc" [0] AS DOUBLE) AS "lon",  cast("loc" [1] AS
> DOUBLE)
> > > AS
> > > > "lat"
> > > > ​ ​
> > > > FROM "geode"."Zips"
> > > > ​ ​
> > > > LIMIT  10;
> > > >
> > > > ​This seems to work fine. ​But when i try to use the ST_Point
> function
> > i
> > > > get: "No match found for function signature ST_Point(,
> > > )"
> > > > (full stack is below)
> > > >
> > > > It seems like i've not registered a jar dependency or haven't enabled
> > > > something else?
> > > >
> > > >
> > > > jdbc:calcite:conformance=LENIENT> SELECT "city",
> "ST_Point"(cast("loc"
> > > [0]
> > > > AS DOUBLE), cast("loc" [1] AS DOUBLE)) FROM "geode"."Zips"LIMIT 10;
> > > > 2017-11-28 10:19:15,199 [main] ERROR -
> > > > org.apache.calcite.sql.validate.SqlValidatorException: No match
> found
> > > for
> > > > function signature ST_Point(, )
> > > > 2017-11-28 10:19:15,199 [main] ERROR -
> > > > org.apache.calcite.runtime.CalciteContextException: From line 1,
> > column
> > > 16
> > > > to line 1, column 79: No match found for function signature
> > > > ST_Point(, )
> > > > Error: Error while executing SQL "SELECT "city",
> "ST_Point"(cast("loc"
> > > [0]
> > > > AS DOUBLE), cast("loc" [1] AS DOUBLE)) FROM "geode"."Zips"LIMIT 10":
> > From
> > > > line 1, column 16 to line 1, column 79: No match found for function
> > > > signature ST_Point(, ) (state=,code=0)
> > > >
> > > > On 28 November 2017 at 09:32, Christian Tzolov 
> > > wrote:
> > > >
> > > > > @Julian are there some tests, json datasets? Perhaps in
> > > > > calcite-test-dataset?
> > > > > Also I will try to cast the "loc" from Zips into DOUBLE columns to
> > test
> > > > > the ST_Point
> > > > >
> > > > > On 28 November 2017 at 02:24, Julian Hyde 
> wrote:
> > > > >
> > > > >> It’s true that you can’t define a GEOMETRY column in a foreign
> > table.
> > > > But
> > > > >> you can define a VARCHAR column and apply the ST_GeomFromText to
> it,
> > > or
> > > > if
> > > > >> you want a point you can define a pair of DOUBLE columns and apply
> > the
> > > > >> ST_Point function.
> > > > >>
> > > > >> In essence, 

Re: About the Spatial functionality

2017-11-28 Thread Christian Tzolov
Thanks @Michael!  Can I assume that by the time
SchemaFactory#create(SchemaPlus parentSchema, String name, ...) is called the
root schema is constructed?  And can I use the parentSchema like this:

ModelHandler.addFunctions(parentSchema, null, ImmutableList.of(),
    GeoFunctions.class.getName(), "*", true);

On 28 November 2017 at 16:58, Michael Mior  wrote:

> I believe the geospatial functions are not currently registered by default.
> You can see an example of how to do this in CalciteAssert.java. Once you
> have constructed the root schema, the following should be sufficient:
>
> ModelHandler.addFunctions(rootSchema, null, ImmutableList.of(),
> GeoFunctions.class.getName(), "*", true);
>
> --
> Michael Mior
> mm...@apache.org
>
> 2017-11-28 4:27 GMT-05:00 Christian Tzolov :
>
> > I've tried to cast the Zip's loc column into double like this:
> >
> > SELECT
> > ​ ​
> > "city",  cast("loc" [0] AS DOUBLE) AS "lon",  cast("loc" [1] AS DOUBLE)
> AS
> > "lat"
> > ​ ​
> > FROM "geode"."Zips"
> > ​ ​
> > LIMIT  10;
> >
> > ​This seems to work fine. ​But when i try to use the ST_Point function i
> > get: "No match found for function signature ST_Point(,
> )"
> > (full stack is below)
> >
> > It seems like i've not registered a jar dependency or haven't enabled
> > something else?
> >
> >
> > jdbc:calcite:conformance=LENIENT> SELECT "city", "ST_Point"(cast("loc"
> [0]
> > AS DOUBLE), cast("loc" [1] AS DOUBLE)) FROM "geode"."Zips"LIMIT 10;
> > 2017-11-28 10:19:15,199 [main] ERROR -
> > org.apache.calcite.sql.validate.SqlValidatorException: No match found
> for
> > function signature ST_Point(, )
> > 2017-11-28 10:19:15,199 [main] ERROR -
> > org.apache.calcite.runtime.CalciteContextException: From line 1, column
> 16
> > to line 1, column 79: No match found for function signature
> > ST_Point(, )
> > Error: Error while executing SQL "SELECT "city", "ST_Point"(cast("loc"
> [0]
> > AS DOUBLE), cast("loc" [1] AS DOUBLE)) FROM "geode"."Zips"LIMIT 10": From
> > line 1, column 16 to line 1, column 79: No match found for function
> > signature ST_Point(, ) (state=,code=0)
> >
> > On 28 November 2017 at 09:32, Christian Tzolov 
> wrote:
> >
> > > @Julian are there some tests, json datasets? Perhaps in
> > > calcite-test-dataset?
> > > Also I will try to cast the "loc" from Zips into DOUBLE columns to test
> > > the ST_Point
> > >
> > > On 28 November 2017 at 02:24, Julian Hyde  wrote:
> > >
> > >> It’s true that you can’t define a GEOMETRY column in a foreign table.
> > But
> > >> you can define a VARCHAR column and apply the ST_GeomFromText to it,
> or
> > if
> > >> you want a point you can define a pair of DOUBLE columns and apply the
> > >> ST_Point function.
> > >>
> > >> In essence, our implementation of GEOMETRY is only an in-memory format
> > >> right now, not an on-disk format. It’s a little less efficient than a
> > >> native GEOMETRY data type but hopefully over time we will write
> > optimizer
> > >> rules that push down filters etc. so we don’t literally construct an
> > >> in-memory geometry object for every row, only the rows we are
> > interested in.
> > >>
> > >> Julian
> > >>
> > >> > On Nov 27, 2017, at 2:59 AM, Christian Tzolov 
> > >> wrote:
> > >> >
> > >> > Hey there,
> > >> >
> > >> > I'm exploring the new Spatial (https://calcite.apache.org/do
> > >> cs/spatial.html)
> > >> > functionality and i've been trying to figure out what are the
> minimal
> > >> > requirements for using it with my custom adapter.
> > >> >
> > >> > Following the guidelines i've set LENIENT  conformance in my jdbc
> URL
> > (
> > >> > jdbc:calcite:conformance=LENIENT;
> > >> > ​model=...my model​
> > >> > ​
> > >> > ​
> > >> > ​)
> > >> >
> > >> > But I am not sure how define the GEOMETRY column types?​
> > >> >
> > >> > Currently my custom Schema/Table factory implementation infers the
> > >> column
> > >> > types from the underlaying system's field types.
> > >> >
> > >> > So it seems that i need to change my implementation and somehow to
> > hint
> > >> > which fields needs to be mapped to GEOMETRY types?  Or perhaps i can
> > >> try to
> > >> > do some expensive casting in SQL?
> > >> >
> > >> > Are there any guidelines, examples ​for using Spatial functionality
> on
> > >> 3rd
> > >> > party (e.g. custom) adapters?
> > >> >
> > >> > Thanks,
> > >> > Christian
> > >>
> > >>
> > >
> > >
> > > --
> > > Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principle
> > Software
> > > Engineer | Spring <https://spring.io/>.io | Pivotal <
> http://pivotal.io/>
> > > | ctzo...@pivotal.io
> > >
> >
> >
> >
> > --
> > Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principle
> Software
> > Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/>
> |
> > ctzo...@pivotal.io
> >
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/> |
ctzo...@pivotal.io


Re: About the Spatial functionality

2017-11-28 Thread Christian Tzolov
I've tried to cast the Zips' loc column into DOUBLE like this:

SELECT "city",
  cast("loc" [0] AS DOUBLE) AS "lon",
  cast("loc" [1] AS DOUBLE) AS "lat"
FROM "geode"."Zips"
LIMIT 10;

This seems to work fine. But when I try to use the ST_Point function I
get: "No match found for function signature ST_Point(, )"
(the full stack is below).

It seems like I've not registered a jar dependency or haven't enabled
something else?


jdbc:calcite:conformance=LENIENT> SELECT "city", "ST_Point"(cast("loc" [0]
AS DOUBLE), cast("loc" [1] AS DOUBLE)) FROM "geode"."Zips"LIMIT 10;
2017-11-28 10:19:15,199 [main] ERROR -
org.apache.calcite.sql.validate.SqlValidatorException: No match found for
function signature ST_Point(, )
2017-11-28 10:19:15,199 [main] ERROR -
org.apache.calcite.runtime.CalciteContextException: From line 1, column 16
to line 1, column 79: No match found for function signature
ST_Point(, )
Error: Error while executing SQL "SELECT "city", "ST_Point"(cast("loc" [0]
AS DOUBLE), cast("loc" [1] AS DOUBLE)) FROM "geode"."Zips"LIMIT 10": From
line 1, column 16 to line 1, column 79: No match found for function
signature ST_Point(, ) (state=,code=0)

On 28 November 2017 at 09:32, Christian Tzolov  wrote:

> @Julian are there some tests, json datasets? Perhaps in
> calcite-test-dataset?
> Also I will try to cast the "loc" from Zips into DOUBLE columns to test
> the ST_Point
>
> On 28 November 2017 at 02:24, Julian Hyde  wrote:
>
>> It’s true that you can’t define a GEOMETRY column in a foreign table. But
>> you can define a VARCHAR column and apply the ST_GeomFromText to it, or if
>> you want a point you can define a pair of DOUBLE columns and apply the
>> ST_Point function.
>>
>> In essence, our implementation of GEOMETRY is only an in-memory format
>> right now, not an on-disk format. It’s a little less efficient than a
>> native GEOMETRY data type but hopefully over time we will write optimizer
>> rules that push down filters etc. so we don’t literally construct an
>> in-memory geometry object for every row, only the rows we are interested in.
>>
>> Julian
>>
>> > On Nov 27, 2017, at 2:59 AM, Christian Tzolov 
>> wrote:
>> >
>> > Hey there,
>> >
>> > I'm exploring the new Spatial (https://calcite.apache.org/do
>> cs/spatial.html)
>> > functionality and i've been trying to figure out what are the minimal
>> > requirements for using it with my custom adapter.
>> >
>> > Following the guidelines i've set LENIENT  conformance in my jdbc URL (
>> > jdbc:calcite:conformance=LENIENT;
>> > ​model=...my model​
>> > ​
>> > ​
>> > ​)
>> >
>> > But I am not sure how define the GEOMETRY column types?​
>> >
>> > Currently my custom Schema/Table factory implementation infers the
>> column
>> > types from the underlaying system's field types.
>> >
>> > So it seems that i need to change my implementation and somehow to hint
>> > which fields needs to be mapped to GEOMETRY types?  Or perhaps i can
>> try to
>> > do some expensive casting in SQL?
>> >
>> > Are there any guidelines, examples ​for using Spatial functionality on
>> 3rd
>> > party (e.g. custom) adapters?
>> >
>> > Thanks,
>> > Christian
>>
>>
>
>
> --
> Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principle Software
> Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/>
> | ctzo...@pivotal.io
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/> |
ctzo...@pivotal.io


Re: About the Spatial functionality

2017-11-28 Thread Christian Tzolov
@Julian, are there some tests or JSON datasets? Perhaps in
calcite-test-dataset?
Also, I will try to cast the "loc" column from Zips into DOUBLE columns to test
ST_Point.

On 28 November 2017 at 02:24, Julian Hyde  wrote:

> It’s true that you can’t define a GEOMETRY column in a foreign table. But
> you can define a VARCHAR column and apply the ST_GeomFromText to it, or if
> you want a point you can define a pair of DOUBLE columns and apply the
> ST_Point function.
>
> In essence, our implementation of GEOMETRY is only an in-memory format
> right now, not an on-disk format. It’s a little less efficient than a
> native GEOMETRY data type but hopefully over time we will write optimizer
> rules that push down filters etc. so we don’t literally construct an
> in-memory geometry object for every row, only the rows we are interested in.
>
> Julian
>
> > On Nov 27, 2017, at 2:59 AM, Christian Tzolov 
> wrote:
> >
> > Hey there,
> >
> > I'm exploring the new Spatial (https://calcite.apache.org/
> docs/spatial.html)
> > functionality and i've been trying to figure out what are the minimal
> > requirements for using it with my custom adapter.
> >
> > Following the guidelines i've set LENIENT  conformance in my jdbc URL (
> > jdbc:calcite:conformance=LENIENT;
> > ​model=...my model​
> > ​
> > ​
> > ​)
> >
> > But I am not sure how define the GEOMETRY column types?​
> >
> > Currently my custom Schema/Table factory implementation infers the column
> > types from the underlaying system's field types.
> >
> > So it seems that i need to change my implementation and somehow to hint
> > which fields needs to be mapped to GEOMETRY types?  Or perhaps i can try
> to
> > do some expensive casting in SQL?
> >
> > Are there any guidelines, examples ​for using Spatial functionality on
> 3rd
> > party (e.g. custom) adapters?
> >
> > Thanks,
> > Christian
>
>


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring <https://spring.io/>.io | Pivotal <http://pivotal.io/> |
ctzo...@pivotal.io


About the Spatial functionality

2017-11-27 Thread Christian Tzolov
Hey there,

I'm exploring the new spatial (https://calcite.apache.org/docs/spatial.html)
functionality and I've been trying to figure out what the minimal
requirements are for using it with my custom adapter.

Following the guidelines, I've set LENIENT conformance in my JDBC URL
(jdbc:calcite:conformance=LENIENT;model=...my model)

But I am not sure how to define the GEOMETRY column types.

Currently my custom Schema/Table factory implementation infers the column
types from the underlying system's field types.

So it seems that I need to change my implementation and somehow hint
which fields need to be mapped to GEOMETRY types? Or perhaps I can try to
do some expensive casting in SQL?

Are there any guidelines or examples for using the spatial functionality on
3rd party (e.g. custom) adapters?

Thanks,
Christian


Re: Apache Zeppelin with Calcite JDBC adapters

2017-11-23 Thread Christian Tzolov
Hi Julian, indeed I believe that the ability to set autoCommit through the
JDBC URL should solve this.
Here is the ticket: https://issues.apache.org/jira/browse/CALCITE-2065


On 23 November 2017 at 00:37, Julian Hyde  wrote:

> Can you log that a JIRA case to make “autoCommit” an accepted parameter in
> the JDBC URL? I think it would solve this problem.
>
> > On Nov 22, 2017, at 8:52 AM, Christian Tzolov 
> wrote:
> >
> > Hi Josh,
> >
> > Here
> > ​is
> > ​Z
> > eppelin
> > ​'s JDBC Interpreter forcefull auto-commit implementation​
> > :
> > http://bit.ly/2zrykP9
> >
> > ​So if the connection is not auto-commit = true the JDBC interpreter will
> > forcefully call commit.
> >
> > But Avatica's connection defaults to auto-commit = true and there is no
> way
> > to configure it from the properties! I can not call
> > Connection#setAutoCommit(boolean) without the Zeppelin's generic JDBC
> > interpret implementation. There is ZEPPELIN-2451 issue that tries to
> > address this but i'm not sure the approach is right (see my comment in
> > https://github.com/apache/zeppelin/pull/2396)
> >
> > The connection.commit() called by Zeppelin hits the
> CalciteMetaImpl#commit
> > ​(). Later throws an explicit ​
> > UnsupportedOperationException
> > ​. E.g. ​
> >
> > @Override public void commit(ConnectionHandle ch) {
> >  throw new UnsupportedOperationException();
> > }
> >
> >
> >
> > On 22 November 2017 at 17:11, Josh Elser  wrote:
> >
> >> Hey Christian,
> >>
> >> Thanks for sharing this. Sounds cool.
> >>
> >> I'm curious what you mean when you say that the Avatica connection
> doesn't
> >> support commit. This was implemented in CALCITE-767.
> >>
> >> Also, is there a reason that Zeppelin needs a property to control
> >> autoCommit and can't use the Connection#setAutoCommit(boolean) method?
> >>
> >>
> >> On 11/22/17 8:09 AM, Christian Tzolov wrote:
> >>
> >>> FYI,
> >>>
> >>> you can use the Apache Zeppelin's generic JDBC interpreter to plug a
> >>> Calcite based JDBC adapter.
> >>>
> >>> Here are my results of plug-in the Geode-Calcite adapter:
> >>> https://www.linkedin.com/pulse/advanced-apache-geode-data-
> >>> analytics-zeppelin-over-sqljdbc-tzolov/
> >>>
> >>> Although i've been testing with particular adapter i believe the
> approach
> >>> and the findings would be applicable for any Avatica JDBC adapter too.
> >>>
> >>> The only minor but blocking !!! issue is related to the fact that
> >>> Zeppelin's JDBC interpreter tries to 'commit' the jdbc connection if
> later
> >>> is not set as 'auto-commit=true'. But the Avatica connection does NOT
> >>> support commits.
> >>>
> >>> At the moment there is no way to switch this behavior off on the
> Zeppelin
> >>> side. The ZEPPELIN-2451 issue and PR hopefully will resolve this.
> >>>
> >>> Unfortunately Avatica doesn't provide properties to set the
> auto-commit to
> >>> true (default is hardcoded to false).
> >>>
> >>> As a temporal solution you can build Zeppelin from my fork (
> >>> https://github.com/tzolov/zeppelin/tree/calcite) and play with
> Zeppelin
> >>> and
> >>> Calcite adapters.
> >>>
> >>> Cheers,
> >>> Christian
> >>>
> >>>
> >
> >
> > --
> > Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principle
> Software
> > Engineer | Pivotal <http://pivotal.io/> | ctzo...@pivotal.io |
> +31610285517
>
>


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Pivotal <http://pivotal.io/> | ctzo...@pivotal.io |+31610285517


[jira] [Created] (CALCITE-2065) Make “autoCommit” an accepted parameter in the JDBC URL

2017-11-23 Thread Christian Tzolov (JIRA)
Christian Tzolov created CALCITE-2065:
-

 Summary: Make “autoCommit” an accepted parameter in the JDBC URL
 Key: CALCITE-2065
 URL: https://issues.apache.org/jira/browse/CALCITE-2065
 Project: Calcite
  Issue Type: Improvement
  Components: avatica
Reporter: Christian Tzolov


At the moment there is no declarative way to set the connection's "autoCommit"
property. It defaults to `false`.
There are a couple of use cases (like the Apache Zeppelin integration) where the
ability to set the autoCommit property to true is required.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Apache Zeppelin with Calcite JDBC adapters

2017-11-22 Thread Christian Tzolov
Hi Josh,

Here is Zeppelin's JDBC Interpreter's forceful auto-commit implementation:
http://bit.ly/2zrykP9

So if the connection is not auto-commit = true, the JDBC interpreter will
forcefully call commit.

But Avatica's connection defaults to auto-commit = false and there is no way
to configure it from the properties! I cannot call
Connection#setAutoCommit(boolean) without changing Zeppelin's generic JDBC
interpreter implementation. The ZEPPELIN-2451 issue tries to
address this, but I'm not sure the approach is right (see my comment in
https://github.com/apache/zeppelin/pull/2396).

The connection.commit() called by Zeppelin hits CalciteMetaImpl#commit(),
which throws an explicit UnsupportedOperationException. E.g.:

@Override public void commit(ConnectionHandle ch) {
  throw new UnsupportedOperationException();
}



On 22 November 2017 at 17:11, Josh Elser  wrote:

> Hey Christian,
>
> Thanks for sharing this. Sounds cool.
>
> I'm curious what you mean when you say that the Avatica connection doesn't
> support commit. This was implemented in CALCITE-767.
>
> Also, is there a reason that Zeppelin needs a property to control
> autoCommit and can't use the Connection#setAutoCommit(boolean) method?
>
>
> On 11/22/17 8:09 AM, Christian Tzolov wrote:
>
>> FYI,
>>
>> you can use the Apache Zeppelin's generic JDBC interpreter to plug a
>> Calcite based JDBC adapter.
>>
>> Here are my results of plug-in the Geode-Calcite adapter:
>> https://www.linkedin.com/pulse/advanced-apache-geode-data-
>> analytics-zeppelin-over-sqljdbc-tzolov/
>>
>> Although i've been testing with particular adapter i believe the approach
>> and the findings would be applicable for any Avatica JDBC adapter too.
>>
>> The only minor but blocking !!! issue is related to the fact that
>> Zeppelin's JDBC interpreter tries to 'commit' the jdbc connection if later
>> is not set as 'auto-commit=true'. But the Avatica connection does NOT
>> support commits.
>>
>> At the moment there is no way to switch this behavior off on the Zeppelin
>> side. The ZEPPELIN-2451 issue and PR hopefully will resolve this.
>>
>> Unfortunately Avatica doesn't provide properties to set the auto-commit to
>> true (default is hardcoded to false).
>>
>> As a temporal solution you can build Zeppelin from my fork (
>> https://github.com/tzolov/zeppelin/tree/calcite) and play with Zeppelin
>> and
>> Calcite adapters.
>>
>> Cheers,
>> Christian
>>
>>


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Pivotal <http://pivotal.io/> | ctzo...@pivotal.io |+31610285517


Apache Zeppelin with Calcite JDBC adapters

2017-11-22 Thread Christian Tzolov
FYI,

you can use Apache Zeppelin's generic JDBC interpreter to plug in a
Calcite-based JDBC adapter.

Here are my results of plugging in the Geode-Calcite adapter:
https://www.linkedin.com/pulse/advanced-apache-geode-data-analytics-zeppelin-over-sqljdbc-tzolov/

Although I've been testing with a particular adapter, I believe the approach
and the findings would be applicable to any Avatica JDBC adapter too.

The only minor but blocking (!!!) issue is related to the fact that
Zeppelin's JDBC interpreter tries to 'commit' the JDBC connection if the latter
is not set to 'auto-commit=true'. But the Avatica connection does NOT
support commits.

At the moment there is no way to switch this behavior off on the Zeppelin
side. The ZEPPELIN-2451 issue and PR will hopefully resolve this.

Unfortunately Avatica doesn't provide properties to set auto-commit to
true (the default is hardcoded to false).

As a temporary solution you can build Zeppelin from my fork (
https://github.com/tzolov/zeppelin/tree/calcite) and play with Zeppelin and
Calcite adapters.

Cheers,
Christian

-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Pivotal <http://pivotal.io/> | ctzo...@pivotal.io |+31610285517


Setting connection's auto-commit property from the configuration files?

2017-11-18 Thread Christian Tzolov
Is there a way to set ConnectionProperties#setAutoCommit(boolean val)
from the model.json, or perhaps via the JDBC URL attributes?

E.g. I'm looking for any configuration option that wouldn't require writing
code such as Connection.setAutoCommit(...)
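
For reference, the programmatic route I'm trying to avoid is just the
following (the model path is a placeholder):

  Connection connection =
      DriverManager.getConnection("jdbc:calcite:model=/path/to/model.json");
  connection.setAutoCommit(true);  // what I'd like to express declaratively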

-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Pivotal <http://pivotal.io/> | ctzo...@pivotal.io |+31610285517


[jira] [Created] (CALCITE-2059) Apache Geode adapter

2017-11-17 Thread Christian Tzolov (JIRA)
Christian Tzolov created CALCITE-2059:
-

 Summary: Apache Geode adapter
 Key: CALCITE-2059
 URL: https://issues.apache.org/jira/browse/CALCITE-2059
 Project: Calcite
  Issue Type: New Feature
Reporter: Christian Tzolov
Assignee: Julian Hyde


I've been working on a Calcite adapter for [Apache Geode|http://geode.apache.org].
The current implementation uses the plain Geode API and 
[OQL|http://geode.apache.org/docs/guide/13/developing/querying_basics/chapter_overview.html] 
(Object Query Language) to push down relational expressions such as projections, 
filtering, sorting, and grouping.

The provided functionality can hopefully address certain Geode use cases and will 
provide a stepping stone for future improvements.
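
For illustration, here is the kind of OQL call the adapter issues under the hood.
The locator host/port and region name follow the sample data set used in this work;
this is illustrative, not the adapter's actual code.

{code:java}
// Illustrative only; assumes org.apache.geode:geode-core on the classpath.
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.query.SelectResults;

public class OqlSample {
  public static void main(String[] args) throws Exception {
    ClientCache cache = new ClientCacheFactory()
        .addPoolLocator("localhost", 10334)   // assumed locator host/port
        .create();
    // The kind of OQL generated for a pushed-down Project + Filter:
    SelectResults<?> results = (SelectResults<?>) cache.getQueryService()
        .newQuery("SELECT yearPublished, retailCost FROM /BookMaster WHERE retailCost > 20")
        .execute();
    System.out.println(results.size() + " matching books");
    cache.close();
  }
}
{code}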

Here are some remaining tasks as I see them:
* New tests for test suite (and update calcite-test-dataset to support Geode)
* Add Integration tests that use calcite-test-dataset
* Documentation



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Network access to /vlsi/calcite-test-dataset VM

2017-11-12 Thread Christian Tzolov
I am still not sure how you connect the existing Adapter Integration
Tests to the ubuntucalcite VM (https://github.com/vlsi/calcite-test-dataset)?

The only way I managed to access this VM is by setting a private network in
the Vagrantfile like this:

ubuntucalcite.vm.network :private_network, ip: "10.211.55.1"

This adds an additional "Host-only Adapter" in the VM that is visible on the
host machine as the vboxnet1 interface with an explicit IP:

vboxnet1: flags=8943 mtu 1500
  ether 0a:00:27:00:00:01
  inet 10.211.55.1 netmask 0xff00 broadcast 10.211.55.255

Using that IP I can connect to and access the services on the VM.

But I guess you have another way to do this without private_network
definitions?






On 11 November 2017 at 12:52, Christian Tzolov  wrote:

> I'm adding Apache Geode provisioning to the /vlsi/calcite-test-dataset
> test VM
>
> I can build and start the Vagrant environment, i can
> "vagrant ssh" and verify that all the services are running that the
> required ports are forwarded.
>
> But i can't access those services from the host (MacOSX) machine. The
> ifconfig shows:
>
> vboxnet0: flags=8843 mtu 1500
>ether 0a:00:27:00:00:00
>
> So i am not sure if this interface is accessible and one which IP address?
>
> How do you run the IT tests against the calcite-test-dataset (e.g.
> ubuntucalcite) VM?
>
> Cheers,
> Christian
>
>


Network access to /vlsi/calcite-test-dataset VM

2017-11-11 Thread Christian Tzolov
I'm adding Apache Geode provisioning to the /vlsi/calcite-test-dataset test
VM.

I can build and start the Vagrant environment, I can "vagrant ssh" and
verify that all the services are running and that the required ports are
forwarded.

But I can't access those services from the host (MacOSX) machine. The
ifconfig shows:

vboxnet0: flags=8843 mtu 1500
   ether 0a:00:27:00:00:00

So I am not sure whether this interface is accessible and on which IP address.

How do you run the IT tests against the calcite-test-dataset (e.g.
ubuntucalcite) VM?

Cheers,
Christian


Re: Project + Sort on single and on multiple columns

2017-11-11 Thread Christian Tzolov
@Julian, this is exactly what I would expect as the logical behavior.

In fact it (consistently) behaves like this (i.e. Sort is after Project)
when the Sort is on a single column! But for some reason, if the Sort is
performed on two columns (the same ones used in the Project), then the Sort
is performed before the Project!

This would be expected if the Sort columns were not contained in the
Project. So I wonder whether somewhere during the Sort optimization this
column check gets confused?
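
(For reference, a minimal sketch of a row-width-aware Sort cost, in the
spirit of the explanation quoted below; it would live inside a Sort-based
rel such as GeodeSortRel. The 4-bytes-per-field width estimate is a crude
placeholder and this is not the adapter's actual cost function. It uses
org.apache.calcite.plan.*, rel.metadata.RelMetadataQuery and util.Util.)

@Override public RelOptCost computeSelfCost(RelOptPlanner planner,
    RelMetadataQuery mq) {
  double rowCount = mq.getRowCount(getInput());
  double bytesPerRow = getRowType().getFieldCount() * 4;  // crude width estimate
  double cpu = Util.nLogN(rowCount) * bytesPerRow;        // external merge-sort style cost
  return planner.getCostFactory().makeCost(rowCount, cpu, 0);
}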

On 11 November 2017 at 03:05, Julian Hyde  wrote:

> While the cost of Project does not depend heavily on the number of input
> columns, the cost of Sort (or at least a typical Sort algorithm such as
> external merge sort) does depend on the number of columns (or more
> precisely on the average row size in bytes). So, if the Project reduces the
> number of columns (as most Projects do) then the Sort will have lower cost
> if performed after the Project, because it is handling fewer bytes.
>
>
> > On Nov 10, 2017, at 10:32 AM, Luis Fernando Kauer
>  wrote:
> >
> > I'm trying to fix https://issues.apache.org/jira/browse/CALCITE-1906
> and I'm facing a similar problem.
> > After managing to make JdbcSort to work, sometimes the JdbcProject is
> above JdbcSort and the generated SQL is wrong because RelToSqlConverter
> uses SqlImplementator.Clause enum to decide when to create subqueries, but
> since ORDER_BY is after SELECT, once it gets to JdbcProject it can't use
> the same query because it already used ORDER_BY.
> > The rule responsable for this is SortProjectTransposeRule.  The opposite
> rule is ProjectSortTransposeRule, but this one only matches if the sort
> node is exactly Sort.class, so it ends up not matching.
> > Is pushing the Project above Sort usually a good final plan or is it
> done to allow other rules to match?  If it is not, maybe we should solve
> this in the core project.
> >
> >
> >
> >Em sexta-feira, 10 de novembro de 2017 15:57:34 BRST, Michael Mior <
> mm...@uwaterloo.ca> escreveu:
> >
> > Since the cost of the project doesn't depend on the number of columns
> being
> > projected or the size of the input, putting the project before or after
> the
> > sort will result in the same estimated cost. One approach would be to
> scale
> > the cost of the projection based on the fraction of columns projected.
> >
> > --
> > Michael Mior
> > mm...@apache.org
> >
> > 2017-11-10 12:42 GMT-05:00 Christian Tzolov :
> >
> >> I've observed in my no-sql adapter implementation that for queries with
> >> Project + Sort by ONE column the Project is pushed (as expected) before
> >> the Sort but for Sort on MULTIPLE columns the Sort is before the Project.
> >> For example for a query with one sort column:
> >>
> >> SELECT yearPublished FROM BookMaster ORDER BY yearPublished ASC
> >>
> >> The plan looks like expected (project before the sort)
> >>
> >> PLAN=GeodeToEnumerableConverterRel
> >>   GeodeSortRel(sort0=[$0], dir0=[ASC])
> >>     GeodeProjectRel(yearPublished=[$2])
> >>       GeodeTableScanRel(table=[[TEST, BookMaster]])
> >>
> >> But for sort with two columns:
> >>
> >> SELECT yearPublished, itemNumber from BookMaster ORDER BY yearPublished
> >> ASC, itemNumber ASC
> >>
> >> The plan is:
> >>
> >> PLAN=GeodeToEnumerableConverterRel
> >>   GeodeProjectRel(yearPublished=[$2], itemNumber=[$0])
> >>     GeodeSortRel(sort0=[$2], sort1=[$0], dir0=[ASC], dir1=[ASC])
> >>       GeodeTableScanRel(table=[[TEST, BookMaster]])
> >>
> >> I'm not sure i can explain why in the second case the Sort appears before
> >> the Project? Here are my cost functions:
> >>
> >> * GeodeSortRel:
> >> https://github.com/tzolov/calcite/blob/geode-1.3/geode/src/main/java/org/apache/calcite/adapter/geode/rel/GeodeSortRel.java#L51
> >>
> >> * GoedeProjectRel:
> >> https://github.com/tzolov/calcite/blob/4a631d9055340f64f5e644454551f964ea08f9de/geode/src/main/java/org/apache/calcite/adapter/geode/rel/GeodeProjectRel.java#L52
> >>
> >> Cheers,
> >> Christian
>
>


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principle Software
Engineer | Pivotal <http://pivotal.io/> | ctzo...@pivotal.io |+31610285517


Re: Project + Sort on single and on multiple columns

2017-11-10 Thread Christian Tzolov
Not sure I understand. Let's say that we have rows of size 10MB each. IMO it
is not the same if we perform the sort on the complete row (i.e. 10MB) or on
the subset left after the Project. My intuition is that the Project will
shrink the row size and therefore should be performed before the Sort (as it
does for the single-column sort). Or is this intuition wrong?
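
(A rough sketch of the suggestion quoted below, scaling the Project's cost
by the fraction of input columns it keeps, as it might look inside a
Project-based rel such as GeodeProjectRel; this is not the actual
implementation.)

@Override public RelOptCost computeSelfCost(RelOptPlanner planner,
    RelMetadataQuery mq) {
  double rowCount = mq.getRowCount(getInput());
  double fraction = (double) getProjects().size()
      / getInput().getRowType().getFieldCount();
  // Fewer projected columns means a proportionally cheaper Project.
  return planner.getCostFactory().makeCost(rowCount, rowCount * fraction, 0);
}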


On 10 November 2017 at 18:57, Michael Mior  wrote:

> Since the cost of the project doesn't depend on the number of columns being
> projected or the size of the input, putting the project before or after the
> sort will result in the same estimated cost. One approach would be to scale
> the cost of the projection based on the fraction of columns projected.
>
> --
> Michael Mior
> mm...@apache.org
>
> 2017-11-10 12:42 GMT-05:00 Christian Tzolov :
>
> > I've observed in my no-sql adapter implementation that for queries with
> > Project + Sort by ONE column the Project is pushed (as expected) before
> > the Sort but for Sort on MULTIPLE columns the Sort is before the Project.
> > For example for a query with one sort column:
> >
> > SELECT yearPublished FROM BookMaster ORDER BY yearPublished ASC
> >
> > The plan looks like expected (project before the sort)
> >
> > PLAN=GeodeToEnumerableConverterRel
> >   GeodeSortRel(sort0=[$0], dir0=[ASC])
> >     GeodeProjectRel(yearPublished=[$2])
> >       GeodeTableScanRel(table=[[TEST, BookMaster]])
> >
> > But for sort with two columns:
> >
> > SELECT yearPublished, itemNumber from BookMaster ORDER BY yearPublished
> > ASC, itemNumber ASC
> >
> > The plan is:
> >
> > PLAN=GeodeToEnumerableConverterRel
> >   GeodeProjectRel(yearPublished=[$2], itemNumber=[$0])
> >     GeodeSortRel(sort0=[$2], sort1=[$0], dir0=[ASC], dir1=[ASC])
> >       GeodeTableScanRel(table=[[TEST, BookMaster]])
> >
> > I'm not sure i can explain why in the second case the Sort appears before
> > the Project? Here are my cost functions:
> >
> > * GeodeSortRel:
> > https://github.com/tzolov/calcite/blob/geode-1.3/geode/src/main/java/org/apache/calcite/adapter/geode/rel/GeodeSortRel.java#L51
> >
> > * GoedeProjectRel:
> > https://github.com/tzolov/calcite/blob/4a631d9055340f64f5e644454551f964ea08f9de/geode/src/main/java/org/apache/calcite/adapter/geode/rel/GeodeProjectRel.java#L52
> >
> > Cheers,
> > Christian
> >
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principle Software
Engineer | Pivotal <http://pivotal.io/> | ctzo...@pivotal.io |+31610285517


Project + Sort on single and on multiple columns

2017-11-10 Thread Christian Tzolov
I've observed in my no-sql adapter implementation that for queries with
Project + Sort by ONE column the Project is pushed (as expected) before the
Sort, but for a Sort on MULTIPLE columns the Sort is before the Project.

For example, for a query with one sort column:

SELECT yearPublished FROM BookMaster ORDER BY yearPublished ASC

the plan looks as expected (Project before the Sort):

PLAN=GeodeToEnumerableConverterRel
  GeodeSortRel(sort0=[$0], dir0=[ASC])
    GeodeProjectRel(yearPublished=[$2])
      GeodeTableScanRel(table=[[TEST, BookMaster]])

But for a sort with two columns:

SELECT yearPublished, itemNumber FROM BookMaster ORDER BY yearPublished
ASC, itemNumber ASC

the plan is:

PLAN=GeodeToEnumerableConverterRel
  GeodeProjectRel(yearPublished=[$2], itemNumber=[$0])
    GeodeSortRel(sort0=[$2], sort1=[$0], dir0=[ASC], dir1=[ASC])
      GeodeTableScanRel(table=[[TEST, BookMaster]])

I'm not sure I can explain why in the second case the Sort appears before
the Project. Here are my cost functions:

* GeodeSortRel:
https://github.com/tzolov/calcite/blob/geode-1.3/geode/src/main/java/org/apache/calcite/adapter/geode/rel/GeodeSortRel.java#L51

* GeodeProjectRel:
https://github.com/tzolov/calcite/blob/4a631d9055340f64f5e644454551f964ea08f9de/geode/src/main/java/org/apache/calcite/adapter/geode/rel/GeodeProjectRel.java#L52

Cheers,
Christian


Re: COUNT returns Integer while Calcite expects Long

2017-11-09 Thread Christian Tzolov
Thanks for the hint Michael,

The problem was in the GeodeUtils#handleStructEntry method. It used to
return the results using the original types from the underlying system.
This works for the column/field types, because the schema is inferred from
the entity Java class, but it doesn't work for the aggregated results (at
least for some of them). I made it convert the result in accordance with
the expected relDataTypeFields and this solved it.
--Cheers

On 9 November 2017 at 17:28, Michael Mior  wrote:

> Ultimately the result you return from GeodeEnumerator.current needs to have
> the correct type. It's hard to follow exactly what's going on with the
> reflection that's happening, but you probably need to make some changes to
> GeodeUtils.handleJavaObjectEntry.
>
> --
> Michael Mior
> mm...@apache.org
>
> 2017-11-09 11:25 GMT-05:00 Christian Tzolov :
>
> > Regarding
> > > if you're pushing down
> > ​
> > aggregations, you should just make sure that you convert
> > > the result your adapter returns to a Long. (e.g.
> > Long.valueOf(count.longValue()))
> > ​
> > Where to apply the conversion? Inside the GeodeAggregateRel rule?
> >
> > Thanks
> >
> > On 9 November 2017 at 17:21, Christian Tzolov 
> wrote:
> >
> > > Hi Michael,
> > >
> > > You can find current version  in my branch here: https://github.com/
> > > tzolov/calcite/tree/geode-1.3
> > >
> > > I still have not added geode to the calcite-test-dataset so reproduce
> the
> > > problem you need to do this:
> > >
> > > 1. Download geode-sample-bootstrap-0.0.1-SNAPSHOT.jar from:
> > > https://drive.google.com/file/d/0Bw0P8rbcmBaJaGlVZWVEaWE4Tmc
> > > It is a single-node, self-contained SpringBoot app that embeds apache
> > > geode populates some sample data.
> > > Run it like this:
> > > java -Xmx128M -Dgemfire.name=server1 -Dgemfire.server.port=40405
> > > -Dgemfire.jmx-manager-port=1199 -Dgemfire.jmx-manager=true
> > > -Dgemfire.jmx-manager-start=true -Dgemfire.locators=localhost[10334]
> > > -Dgemfire.start-locator=localhost[10334] -Dgemfire.use-cluster-
> > configuration=false
> > > -jar ./geode-sample-bootstrap-0.0.1-SNAPSHOT.jar
> > >
> > > 2. Start the sqlline and connect to Geode using the
> > > geode/src/test/resources/model-rel2.json model:
> > >
> > > sqlline> !connect jdbc:calcite:model=/Users/
> ctzolov/Dev/projects/apache-
> > > calcite-tzolov/geode/src/test/resources/model-rel2.json admin admin
> > >
> > > 0: sqlline> SELECT SUM("retailCost") FROM "BookMaster" GROUP BY
> > > "retailCost";
> > >
> > > ++
> > > | EXPR$0 |
> > > ++
> > > | 59.99  |
> > > | 11.99  |
> > > | 34.99  |
> > > ++
> > >
> > > 0: sqlline> SELECT COUNT(*) FROM "BookMaster";
> > >
> > > java.lang.ClassCastException: java.lang.Integer cannot be cast to
> > > java.lang.Long
> > > at org.apache.calcite.avatica.util.AbstractCursor$
> LongAccessor.getLong(
> > > AbstractCursor.java:550)
> > > at org.apache.calcite.avatica.AvaticaSite.get(AvaticaSite.java:306)
> > > at org.apache.calcite.avatica.AvaticaResultSet.getObject(
> > > AvaticaResultSet.java:409)
> > > at sqlline.Rows$Row.(Rows.java:157)
> > >
> > >
> > > Cheers,
> > > Christian
> > >
> > > On 9 November 2017 at 16:45, Michael Mior  wrote:
> > >
> > >> Christian,
> > >>
> > >> Are you able to share your code for the adapter? This would be helpful
> > in
> > >> understanding the problem. In short however, if you're pushing down
> > >> aggregations, you should just make sure that you convert the result
> your
> > >> adapter returns to a Long. (e.g. Long.valueOf(count.longValue()))
> > >>
> > >> --
> > >> Michael Mior
> > >> mm...@apache.org
> > >>
> > >> 2017-11-09 10:41 GMT-05:00 Christian Tzolov :
> > >>
> > >> > ​(Calcite 1.15.0-SNASHOT and Avalitca 1.10.0)​
> > >> >
> > >> > ​I'm
> > >> >  workin on an adapter for in-memory NoSql
> > >> > ​data ​
> > >> > system
> > >> > ​. The implementations tries to push down all supported aggregation
> > >> > operations, such as AVG, MAX and COUNT. It works for most of them
> b

Re: COUNT returns Integer while Calcite expects Long

2017-11-09 Thread Christian Tzolov
Regarding:
> if you're pushing down aggregations, you should just make sure that you
> convert the result your adapter returns to a Long. (e.g.
> Long.valueOf(count.longValue()))

Where should I apply the conversion? Inside the GeodeAggregateRel rule?

Thanks

On 9 November 2017 at 17:21, Christian Tzolov  wrote:

> Hi Michael,
>
> You can find current version  in my branch here: https://github.com/
> tzolov/calcite/tree/geode-1.3
>
> I still have not added geode to the calcite-test-dataset so reproduce the
> problem you need to do this:
>
> 1. Download geode-sample-bootstrap-0.0.1-SNAPSHOT.jar from:
> https://drive.google.com/file/d/0Bw0P8rbcmBaJaGlVZWVEaWE4Tmc
> It is a single-node, self-contained SpringBoot app that embeds apache
> geode populates some sample data.
> Run it like this:
> java -Xmx128M -Dgemfire.name=server1 -Dgemfire.server.port=40405
> -Dgemfire.jmx-manager-port=1199 -Dgemfire.jmx-manager=true
> -Dgemfire.jmx-manager-start=true -Dgemfire.locators=localhost[10334]
> -Dgemfire.start-locator=localhost[10334] 
> -Dgemfire.use-cluster-configuration=false
> -jar ./geode-sample-bootstrap-0.0.1-SNAPSHOT.jar
>
> 2. Start the sqlline and connect to Geode using the
> geode/src/test/resources/model-rel2.json model:
>
> sqlline> !connect jdbc:calcite:model=/Users/ctzolov/Dev/projects/apache-
> calcite-tzolov/geode/src/test/resources/model-rel2.json admin admin
>
> 0: sqlline> SELECT SUM("retailCost") FROM "BookMaster" GROUP BY
> "retailCost";
>
> ++
> | EXPR$0 |
> ++
> | 59.99  |
> | 11.99  |
> | 34.99  |
> ++
>
> 0: sqlline> SELECT COUNT(*) FROM "BookMaster";
>
> java.lang.ClassCastException: java.lang.Integer cannot be cast to
> java.lang.Long
> at org.apache.calcite.avatica.util.AbstractCursor$LongAccessor.getLong(
> AbstractCursor.java:550)
> at org.apache.calcite.avatica.AvaticaSite.get(AvaticaSite.java:306)
> at org.apache.calcite.avatica.AvaticaResultSet.getObject(
> AvaticaResultSet.java:409)
> at sqlline.Rows$Row.(Rows.java:157)
>
>
> Cheers,
> Christian
>
> On 9 November 2017 at 16:45, Michael Mior  wrote:
>
>> Christian,
>>
>> Are you able to share your code for the adapter? This would be helpful in
>> understanding the problem. In short however, if you're pushing down
>> aggregations, you should just make sure that you convert the result your
>> adapter returns to a Long. (e.g. Long.valueOf(count.longValue()))
>>
>> --
>> Michael Mior
>> mm...@apache.org
>>
>> 2017-11-09 10:41 GMT-05:00 Christian Tzolov :
>>
>> > ​(Calcite 1.15.0-SNASHOT and Avalitca 1.10.0)​
>> >
>> > ​I'm
>> >  workin on an adapter for in-memory NoSql
>> > ​data ​
>> > system
>> > ​. The implementations tries to push down all supported aggregation
>> > operations, such as AVG, MAX and COUNT. It works for most of them but
>> fails
>> > for COUNT:
>> >
>> >
>> > 0: jdbc:calcite:model=
>> > ​...
>> > > SELECT COUNT(*) FROM "BookMaster";
>> >
>> > java.lang.ClassCastException: java.lang.Integer cannot be cast to
>> > java.lang.Long
>> > at
>> > org.apache.calcite.avatica.util.AbstractCursor$LongAccessor.getLong(
>> > AbstractCursor.java:550)
>> > at org.apache.calcite.avatica.AvaticaSite.get(AvaticaSite.java:306)
>> > at
>> > org.apache.calcite.avatica.AvaticaResultSet.getObject(
>> > AvaticaResultSet.java:409)
>> > at sqlline.Rows$Row.(Rows.java:157)
>> >
>> > ​P
>> > roblem IMO is that the
>> > ​COUNT result form the ​
>> > underlaying system
>> > ​is of type
>> >  Integer while
>> > ​C
>> > alcite expects Long
>> > ​ (e.g. BIGINT) as
>> > hardcoded in SqlCountAggFunction
>> > ​.java.
>> >
>> > ​I can't alter the ​SqlCountAggFunction so I wonder what is the right
>> place
>> > in Calcite to implement/configure this type conversion? ​
>> >
>> > ​This issue looks very similar (
>> > https://issues.apache.org/jira/browse/CALCITE-665)​ but i am not sure i
>> > completely understand the resolution?
>> >
>> > Thanks,
>> > Christian​
>> >
>>
>
>
>
> --
> Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principle Software
> Engineer | Pivotal <http://pivotal.io/> | ctzo...@pivotal.io |+31610285517
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principle Software
Engineer | Pivotal <http://pivotal.io/> | ctzo...@pivotal.io |+31610285517


Re: COUNT returns Integer while Calcite expects Long

2017-11-09 Thread Christian Tzolov
Hi Michael,

You can find the current version in my branch here:
https://github.com/tzolov/calcite/tree/geode-1.3

I still have not added Geode to the calcite-test-dataset, so to reproduce
the problem you need to do this:

1. Download geode-sample-bootstrap-0.0.1-SNAPSHOT.jar from:
https://drive.google.com/file/d/0Bw0P8rbcmBaJaGlVZWVEaWE4Tmc
It is a single-node, self-contained Spring Boot app that embeds Apache Geode
and populates some sample data.
Run it like this:
java -Xmx128M -Dgemfire.name=server1 -Dgemfire.server.port=40405
-Dgemfire.jmx-manager-port=1199 -Dgemfire.jmx-manager=true
-Dgemfire.jmx-manager-start=true -Dgemfire.locators=localhost[10334]
-Dgemfire.start-locator=localhost[10334]
-Dgemfire.use-cluster-configuration=false -jar
./geode-sample-bootstrap-0.0.1-SNAPSHOT.jar

2. Start the sqlline and connect to Geode using the
geode/src/test/resources/model-rel2.json model:

sqlline> !connect
jdbc:calcite:model=/Users/ctzolov/Dev/projects/apache-calcite-tzolov/geode/src/test/resources/model-rel2.json
admin admin

0: sqlline> SELECT SUM("retailCost") FROM "BookMaster" GROUP BY
"retailCost";

++
| EXPR$0 |
++
| 59.99  |
| 11.99  |
| 34.99  |
++

0: sqlline> SELECT COUNT(*) FROM "BookMaster";

java.lang.ClassCastException: java.lang.Integer cannot be cast to
java.lang.Long
at
org.apache.calcite.avatica.util.AbstractCursor$LongAccessor.getLong(AbstractCursor.java:550)
at org.apache.calcite.avatica.AvaticaSite.get(AvaticaSite.java:306)
at
org.apache.calcite.avatica.AvaticaResultSet.getObject(AvaticaResultSet.java:409)
at sqlline.Rows$Row.(Rows.java:157)


Cheers,
Christian

On 9 November 2017 at 16:45, Michael Mior  wrote:

> Christian,
>
> Are you able to share your code for the adapter? This would be helpful in
> understanding the problem. In short however, if you're pushing down
> aggregations, you should just make sure that you convert the result your
> adapter returns to a Long. (e.g. Long.valueOf(count.longValue()))
>
> --
> Michael Mior
> mm...@apache.org
>
> 2017-11-09 10:41 GMT-05:00 Christian Tzolov :
>
> > ​(Calcite 1.15.0-SNASHOT and Avalitca 1.10.0)​
> >
> > ​I'm
> >  workin on an adapter for in-memory NoSql
> > ​data ​
> > system
> > ​. The implementations tries to push down all supported aggregation
> > operations, such as AVG, MAX and COUNT. It works for most of them but
> fails
> > for COUNT:
> >
> >
> > 0: jdbc:calcite:model=
> > ​...
> > > SELECT COUNT(*) FROM "BookMaster";
> >
> > java.lang.ClassCastException: java.lang.Integer cannot be cast to
> > java.lang.Long
> > at
> > org.apache.calcite.avatica.util.AbstractCursor$LongAccessor.getLong(
> > AbstractCursor.java:550)
> > at org.apache.calcite.avatica.AvaticaSite.get(AvaticaSite.java:306)
> > at
> > org.apache.calcite.avatica.AvaticaResultSet.getObject(
> > AvaticaResultSet.java:409)
> > at sqlline.Rows$Row.(Rows.java:157)
> >
> > ​P
> > roblem IMO is that the
> > ​COUNT result form the ​
> > underlaying system
> > ​is of type
> >  Integer while
> > ​C
> > alcite expects Long
> > ​ (e.g. BIGINT) as
> > hardcoded in SqlCountAggFunction
> > ​.java.
> >
> > ​I can't alter the ​SqlCountAggFunction so I wonder what is the right
> place
> > in Calcite to implement/configure this type conversion? ​
> >
> > ​This issue looks very similar (
> > https://issues.apache.org/jira/browse/CALCITE-665)​ but i am not sure i
> > completely understand the resolution?
> >
> > Thanks,
> > Christian​
> >
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principle Software
Engineer | Pivotal <http://pivotal.io/> | ctzo...@pivotal.io |+31610285517


COUNT returns Integer while Calcite expects Long

2017-11-09 Thread Christian Tzolov
(Calcite 1.15.0-SNAPSHOT and Avatica 1.10.0)

I'm working on an adapter for an in-memory NoSQL data system. The
implementation tries to push down all supported aggregation operations,
such as AVG, MAX and COUNT. It works for most of them but fails for COUNT:

0: jdbc:calcite:model=...
> SELECT COUNT(*) FROM "BookMaster";

java.lang.ClassCastException: java.lang.Integer cannot be cast to
java.lang.Long
at
org.apache.calcite.avatica.util.AbstractCursor$LongAccessor.getLong(AbstractCursor.java:550)
at org.apache.calcite.avatica.AvaticaSite.get(AvaticaSite.java:306)
at
org.apache.calcite.avatica.AvaticaResultSet.getObject(AvaticaResultSet.java:409)
at sqlline.Rows$Row.(Rows.java:157)

The problem IMO is that the COUNT result from the underlying system is of
type Integer while Calcite expects Long (i.e. BIGINT), as hardcoded in
SqlCountAggFunction.java.

I can't alter SqlCountAggFunction, so I wonder what is the right place in
Calcite to implement/configure this type conversion?

This issue looks very similar (https://issues.apache.org/jira/browse/CALCITE-665)
but I am not sure I completely understand the resolution.

Thanks,
Christian
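
(A minimal sketch of the resolution reported in the follow-up earlier in
this thread: coerce each returned value to the field type Calcite declared,
so that COUNT surfaces as a Long. The class and helper names are
illustrative, not existing adapter code.)

import org.apache.calcite.rel.type.RelDataTypeField;

class DeclaredTypeCoercion {
  // Sketch only: align a raw value from the underlying store with the declared Calcite type.
  static Object convertToDeclaredType(Object value, RelDataTypeField field) {
    if (value instanceof Number) {
      Number number = (Number) value;
      switch (field.getType().getSqlTypeName()) {
      case BIGINT:
        return number.longValue();   // e.g. COUNT(*) must come back as a Long
      case INTEGER:
        return number.intValue();
      case DOUBLE:
        return number.doubleValue();
      default:
        break;
      }
    }
    return value;
  }
}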


Re: INSERT with the JdbcAdapter

2016-12-05 Thread Christian Tzolov
Thanks for the feedback. I will review the comments in the ticket

Source expressions with sub-query are hard. Maybe you can represent them
> using RexSubQuery; and maybe RelDecorrelator can help; I’m not sure. In any
> case, I think you should defer them to another case.
>

Indeed, it would be nice if we can defer the update-sub-query work into a
separate ticket, as it seems it would require more digging. Shall I submit a
new dedicated ticket for this, or shall we postpone it until after committing
the current work?

In this case, it seems that you have implemented INSERT … VALUES but not
> INSERT … SELECT. Can you clarify whether you intend to do the latter?
>

Yes, I have already implemented INSERT ... SELECT along with support for
INSERT VALUES (ROW1, ROW2, ...). I will commit it in the next hour.

Cheers,
Christian


> Julian
>
>
> > On Dec 5, 2016, at 2:21 AM, Christian Tzolov  wrote:
> >
> > Here it goes :)
> > - JIRA: https://issues.apache.org/jira/browse/CALCITE-1527
> > - PR:  https://github.com/apache/calcite/pull/334
> >
> > Seems bit more involving than expected. In addition to the Sql
> > implementation, it requires changes on the SQL parsing side as well as
> the
> > handling for the non-ResultSet responses. There are still some open
> > questions like the handling of source expressions for UPDATE with
> > sub-query.
> >
> >
> >
> >
> > On 30 November 2016 at 23:15, Julian Hyde  wrote:
> >
> >> “Unfortunate” is one word for it. If Calcite were complete it would be
> >> considerably more expensive. :)
> >>
> >> You figured out how to implement OVER, so I’d look in a similar place
> for
> >> DML.
> >>
> >> Julian
> >>
> >>> On Nov 30, 2016, at 1:42 PM, Christian Tzolov 
> >> wrote:
> >>>
> >>> Thanks Julian, this is unfortunate as it undermines the idea of having
> >> jdbc
> >>> wrapper in front of HAWQ.
> >>>
> >>> I will log Jira tickets. How difficult do you think would be to provide
> >> DML
> >>> support for the JDBC adapter? If i am to take a look at it where should
> >>> look first?
> >>>
> >>> Cheers,
> >>> Christian
> >>>
> >>> On 30 November 2016 at 19:23, Julian Hyde  wrote:
> >>>
> >>>> It’s a missing feature. The JDBC adapter does not currently do DML.
> Can
> >>>> you please log a JIRA case to track.
> >>>>
> >>>> Julian
> >>>>
> >>>>
> >>>>> On Nov 30, 2016, at 7:56 AM, Christian Tzolov 
> >>>> wrote:
> >>>>>
> >>>>> A test to reproduce the problem:
> >>>>>
> >>>>> @Test public void testJdbcAdapterInsert() {
> >>>>>
> >>>>> CalciteAssert.model(JdbcTest.FOODMART_MODEL)
> >>>>>.enable(CalciteAssert.DB == POSTGRESQL)
> >>>>>.query("INSERT INTO \"foodmart\".\"expense_fact\"(\n" +
> >>>>>  " \"store_id\", \"account_id\", \"exp_date\",
> >>>>> \"time_id\"," +
> >>>>>  " \"category_id\", \"currency_id\",
> >>>> \"amount\")\n" +
> >>>>> " VALUES (666, 666, TIMESTAMP '1997-01-01 00:00:00', 666,
> >>>> '666',
> >>>>> 666, 666)")
> >>>>>.runs();
> >>>>> }
> >>>>>
> >>>>> Run with  -Dcalcite.test.db=postgresql
> >>>>>
> >>>>>
> >>>>> On 30 November 2016 at 15:37, Christian Tzolov 
> >>>> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> The insert statement via the JdbcAdapter fails with
> >>>>>> "rel#78:Subset#1.ENUMERABLE.[]] could not be implemented;"
> >>>>>>
> >>>>>> I'm testing SQL inserts using the foodmart dataset and postgresql
> >>>>>> configured as a backend.
> >>>>>>
> >>>>>> Following insert works find on postgresql but fails when run through
> >> the
> >>>>>> jdbc adapter:
> >>>>>>
> >>>>>> "INSERT INTO "foodmart"."expense_fact"("store_id", "account_id",
> >>>>>> "exp_date", "time_id", "category_id", "currency_id", "amount")
> VALUES
> >>>> (666,
> >>>>>> 666, TIMESTAMP '1997-01-01 00:00:00', 666, '666', 666, 666);
> >>>>>>
> >>>>>> The jdbc-adapter error:
> >>>>>>
> >>>>>> Error: Error while executing SQL "INSERT INTO
> >>>> "foodmart"."expense_fact"("store_id",
> >>>>>> "account_id", "exp_date", "time_id", "category_id", "currency_id",
> >>>>>> "amount") VALUES (666, 666, TIMESTAMP '1997-01-01 00:00:00', 666,
> >> '666',
> >>>>>> 666, 666)": Node [rel#78:Subset#1.ENUMERABLE.[]] could not be
> >>>>>> implemented; planner state:
> >>>>>>
> >>>>>> It looks like error occurs before the JdbcTableModificationRule#
> >> covert
> >>>> is
> >>>>>> reached.
> >>>>>>
> >>>>>> Is this a limitation or bug?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Christian
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>
> >>
> >
> >
> > --
> > Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution
> Architect,
> > EMEA Practice Team | Pivotal <http://pivotal.io/>
> > ctzo...@pivotal.io|+31610285517
>
>


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


Re: INSERT with the JdbcAdapter

2016-12-05 Thread Christian Tzolov
Here it goes :)
- JIRA: https://issues.apache.org/jira/browse/CALCITE-1527
- PR:  https://github.com/apache/calcite/pull/334

It seems a bit more involved than expected. In addition to the SQL
implementation, it requires changes on the SQL parsing side as well as
handling of the non-ResultSet responses. There are still some open
questions, such as the handling of source expressions for UPDATE with a
sub-query.




On 30 November 2016 at 23:15, Julian Hyde  wrote:

> “Unfortunate” is one word for it. If Calcite were complete it would be
> considerably more expensive. :)
>
> You figured out how to implement OVER, so I’d look in a similar place for
> DML.
>
> Julian
>
> > On Nov 30, 2016, at 1:42 PM, Christian Tzolov 
> wrote:
> >
> > Thanks Julian, this is unfortunate as it undermines the idea of having
> jdbc
> > wrapper in front of HAWQ.
> >
> > I will log Jira tickets. How difficult do you think would be to provide
> DML
> > support for the JDBC adapter? If i am to take a look at it where should
> > look first?
> >
> > Cheers,
> > Christian
> >
> > On 30 November 2016 at 19:23, Julian Hyde  wrote:
> >
> >> It’s a missing feature. The JDBC adapter does not currently do DML. Can
> >> you please log a JIRA case to track.
> >>
> >> Julian
> >>
> >>
> >>> On Nov 30, 2016, at 7:56 AM, Christian Tzolov 
> >> wrote:
> >>>
> >>> A test to reproduce the problem:
> >>>
> >>> @Test public void testJdbcAdapterInsert() {
> >>>
> >>>  CalciteAssert.model(JdbcTest.FOODMART_MODEL)
> >>> .enable(CalciteAssert.DB == POSTGRESQL)
> >>> .query("INSERT INTO \"foodmart\".\"expense_fact\"(\n" +
> >>>   " \"store_id\", \"account_id\", \"exp_date\",
> >>> \"time_id\"," +
> >>>   " \"category_id\", \"currency_id\",
> >> \"amount\")\n" +
> >>>  " VALUES (666, 666, TIMESTAMP '1997-01-01 00:00:00', 666,
> >> '666',
> >>> 666, 666)")
> >>> .runs();
> >>> }
> >>>
> >>> Run with  -Dcalcite.test.db=postgresql
> >>>
> >>>
> >>> On 30 November 2016 at 15:37, Christian Tzolov 
> >> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> The insert statement via the JdbcAdapter fails with
> >>>> "rel#78:Subset#1.ENUMERABLE.[]] could not be implemented;"
> >>>>
> >>>> I'm testing SQL inserts using the foodmart dataset and postgresql
> >>>> configured as a backend.
> >>>>
> >>>> Following insert works find on postgresql but fails when run through
> the
> >>>> jdbc adapter:
> >>>>
> >>>> "INSERT INTO "foodmart"."expense_fact"("store_id", "account_id",
> >>>> "exp_date", "time_id", "category_id", "currency_id", "amount") VALUES
> >> (666,
> >>>> 666, TIMESTAMP '1997-01-01 00:00:00', 666, '666', 666, 666);
> >>>>
> >>>> The jdbc-adapter error:
> >>>>
> >>>> Error: Error while executing SQL "INSERT INTO
> >> "foodmart"."expense_fact"("store_id",
> >>>> "account_id", "exp_date", "time_id", "category_id", "currency_id",
> >>>> "amount") VALUES (666, 666, TIMESTAMP '1997-01-01 00:00:00', 666,
> '666',
> >>>> 666, 666)": Node [rel#78:Subset#1.ENUMERABLE.[]] could not be
> >>>> implemented; planner state:
> >>>>
> >>>> It looks like error occurs before the JdbcTableModificationRule#
> covert
> >> is
> >>>> reached.
> >>>>
> >>>> Is this a limitation or bug?
> >>>>
> >>>> Thanks,
> >>>> Christian
> >>>>
> >>>
> >>
> >>
>
>


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


[jira] [Created] (CALCITE-1527) Add DML support in the JDBC adapter

2016-12-04 Thread Christian Tzolov (JIRA)
Christian Tzolov created CALCITE-1527:
-

 Summary: Add DML support in the JDBC adapter
 Key: CALCITE-1527
 URL: https://issues.apache.org/jira/browse/CALCITE-1527
 Project: Calcite
  Issue Type: Bug
  Components: avatica, core, jdbc-adapter
Reporter: Christian Tzolov


Currently the JDBC adapter does not support the DML operations: *INSERT*, 
*DELETE* and *UPDATE*.

The solution needs to convert the parsed *Modify* and *Values* RelNodes into 
*JdbcTableModify*, *JdbcValues*, etc., and then in turn into the corresponding 
SqlInsert, SqlUpdate and SqlDelete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: INSERT with the JdbcAdapter

2016-11-30 Thread Christian Tzolov
Thanks Julian, this is unfortunate as it undermines the idea of having a
JDBC wrapper in front of HAWQ.

I will log JIRA tickets. How difficult do you think it would be to provide
DML support for the JDBC adapter? If I am to take a look at it, where should
I look first?

Cheers,
Christian

On 30 November 2016 at 19:23, Julian Hyde  wrote:

> It’s a missing feature. The JDBC adapter does not currently do DML. Can
> you please log a JIRA case to track.
>
> Julian
>
>
> > On Nov 30, 2016, at 7:56 AM, Christian Tzolov 
> wrote:
> >
> > A test to reproduce the problem:
> >
> > @Test public void testJdbcAdapterInsert() {
> >
> >   CalciteAssert.model(JdbcTest.FOODMART_MODEL)
> >  .enable(CalciteAssert.DB == POSTGRESQL)
> >  .query("INSERT INTO \"foodmart\".\"expense_fact\"(\n" +
> >" \"store_id\", \"account_id\", \"exp_date\",
> > \"time_id\"," +
> >" \"category_id\", \"currency_id\",
> \"amount\")\n" +
> >   " VALUES (666, 666, TIMESTAMP '1997-01-01 00:00:00', 666,
> '666',
> > 666, 666)")
> >  .runs();
> > }
> >
> > Run with  -Dcalcite.test.db=postgresql
> >
> >
> > On 30 November 2016 at 15:37, Christian Tzolov 
> wrote:
> >
> >> Hi,
> >>
> >> The insert statement via the JdbcAdapter fails with
> >> "rel#78:Subset#1.ENUMERABLE.[]] could not be implemented;"
> >>
> >> I'm testing SQL inserts using the foodmart dataset and postgresql
> >> configured as a backend.
> >>
> >> Following insert works find on postgresql but fails when run through the
> >> jdbc adapter:
> >>
> >> "INSERT INTO "foodmart"."expense_fact"("store_id", "account_id",
> >> "exp_date", "time_id", "category_id", "currency_id", "amount") VALUES
> (666,
> >> 666, TIMESTAMP '1997-01-01 00:00:00', 666, '666', 666, 666);
> >>
> >> The jdbc-adapter error:
> >>
> >> Error: Error while executing SQL "INSERT INTO
> "foodmart"."expense_fact"("store_id",
> >> "account_id", "exp_date", "time_id", "category_id", "currency_id",
> >> "amount") VALUES (666, 666, TIMESTAMP '1997-01-01 00:00:00', 666, '666',
> >> 666, 666)": Node [rel#78:Subset#1.ENUMERABLE.[]] could not be
> >> implemented; planner state:
> >>
> >> It looks like error occurs before the JdbcTableModificationRule#covert
> is
> >> reached.
> >>
> >> Is this a limitation or bug?
> >>
> >> Thanks,
> >> Christian
> >>
> >
>
>


Re: Rule to expand tha JdbcTableScan expression

2016-11-30 Thread Christian Tzolov
Thanks,

Indeed, by playing with (and overriding) the cost of the JDBC rel nodes I
managed to expand the trees (in many cases).

How can I set the HepPlanner instead of the Volcano one?
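
(A minimal sketch of driving a single rewrite rule with the HepPlanner;
MyExpandRule stands in for the custom rule and is a placeholder, not an
existing class.)

import org.apache.calcite.plan.hep.HepPlanner;
import org.apache.calcite.plan.hep.HepProgram;
import org.apache.calcite.rel.RelNode;

class HepRewrite {
  // Sketch only: apply one rule to a fixed point with the heuristic planner,
  // independently of the Volcano cost model.
  static RelNode rewrite(RelNode relNode) {
    HepProgram program = HepProgram.builder()
        .addRuleInstance(MyExpandRule.INSTANCE)   // placeholder custom rule
        .build();
    HepPlanner hepPlanner = new HepPlanner(program);
    hepPlanner.setRoot(relNode);
    return hepPlanner.findBestExp();
  }
}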



On 30 November 2016 at 20:50, jordan.halter...@gmail.com <
jordan.halter...@gmail.com> wrote:

> Which planner are you using? If the rule is being fired, what you may be
> missing is that the cost of the converted expression is more than the cost
> of the input expression, resulting in the VolcanoPlanner throwing out the
> converted expression. You should use the HepPlanner for this.
>
> > On Nov 29, 2016, at 2:07 PM, Christian Tzolov 
> wrote:
> >
> > We're working on a JDBC wrapper for HAWQ (SQL-on-HADOOP) that hopefully
> > would allow us "emulate" UPDATE and DELETE operations while using only
> > append-only (e.g. INSERT) in the background.
> >
> > Using the JDBC adapter i've been trying to convert an input sql query
> like
> > this:
> >
> > SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
> > amount  FROM foodmart.expense_fact
> >
> > Into one that looks like this:
> >
> > SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
> > amount
> > FROM (
> >SELECT store_id, account_id, exp_date, time_id, category_id,
> > currency_id, amount
> >FROM (
> >SELECT *, MAX(exp_date) OVER (PARTITION BY account_id) AS
> > last_version_number
> >FROM foodmart.expense_fact
> > ) as link_last
> >WHERE exp_date = last_version_number
> > ) as current_version;
> >
> > If you run the second query directly the output relation is:
> >
> > "PLAN=JdbcToEnumerableConverter
> >  JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
> > category_id=[$4], currency_id=[$5], amount=[$6])
> >JdbcFilter(condition=[=($2, $7)])
> >  JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
> > time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
> >  last_version_number=[MAX($2) OVER (PARTITION BY $1 RANGE
> > BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)])
> >JdbcTableScan(table=[[foodmart, expense_fact]])
> >
> >
> > I've created a rule meant to do this:
> > https://gist.github.com/tzolov/a426d10a55fdd712a67830b0154b33db   On
> match
> > it creates an expression that matches the one above:
> >
> > "Expanded RelNode:
> > JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
> > category_id=[$4], currency_id=[$5], amount=[$6])
> >  JdbcFilter(condition=[=($2, $7)])
> >JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
> > time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
> > last_version_number=[MAX($2) OVER (PARTITION BY $1 ROWS BETWEEN UNBOUNDED
> > PRECEDING AND UNBOUNDED FOLLOWING)])
> >  JdbcTableScan(table=[[foodmart, expense_fact]])"
> >
> > But after the Planner completes the optimization the final output is this
> > (e.g. the initial query):
> >
> > "PLAN=JdbcToEnumerableConverter
> >  JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
> > category_id=[$4], currency_id=[$5])
> >JdbcTableScan(table=[[foodmart, expense_fact]])
> >
> > 1. What am i missing? Are the field/type references copied correctly?
> >
> > 2. Also what is the right approach to prevent this Rule get in loop?
> >
> > I'm a bit stuck so any ideas and suggestions appreciated!
> >
> > Cheers,
> > Christian
> >
> > --
> > Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution
> Architect,
> > EMEA Practice Team | Pivotal <http://pivotal.io/>
> > ctzo...@pivotal.io|+31610285517
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


Re: INSERT with the JdbcAdapter

2016-11-30 Thread Christian Tzolov
A test to reproduce the problem:

@Test public void testJdbcAdapterInsert() {
  CalciteAssert.model(JdbcTest.FOODMART_MODEL)
      .enable(CalciteAssert.DB == POSTGRESQL)
      .query("INSERT INTO \"foodmart\".\"expense_fact\"(\n"
          + " \"store_id\", \"account_id\", \"exp_date\", \"time_id\","
          + " \"category_id\", \"currency_id\", \"amount\")\n"
          + " VALUES (666, 666, TIMESTAMP '1997-01-01 00:00:00', 666, '666',"
          + " 666, 666)")
      .runs();
}

Run with  -Dcalcite.test.db=postgresql


On 30 November 2016 at 15:37, Christian Tzolov  wrote:

> Hi,
>
> The insert statement via the JdbcAdapter fails with
> "rel#78:Subset#1.ENUMERABLE.[]] could not be implemented;"
>
> I'm testing SQL inserts using the foodmart dataset and postgresql
> configured as a backend.
>
> Following insert works find on postgresql but fails when run through the
> jdbc adapter:
>
> "INSERT INTO "foodmart"."expense_fact"("store_id", "account_id",
> "exp_date", "time_id", "category_id", "currency_id", "amount") VALUES (666,
> 666, TIMESTAMP '1997-01-01 00:00:00', 666, '666', 666, 666);
>
> The jdbc-adapter error:
>
> Error: Error while executing SQL "INSERT INTO 
> "foodmart"."expense_fact"("store_id",
> "account_id", "exp_date", "time_id", "category_id", "currency_id",
> "amount") VALUES (666, 666, TIMESTAMP '1997-01-01 00:00:00', 666, '666',
> 666, 666)": Node [rel#78:Subset#1.ENUMERABLE.[]] could not be
> implemented; planner state:
>
> It looks like error occurs before the JdbcTableModificationRule#covert is
> reached.
>
> Is this a limitation or bug?
>
> Thanks,
> Christian
>
>
> --
> Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution
> Architect, EMEA Practice Team | Pivotal <http://pivotal.io/>
> ctzo...@pivotal.io|+31610285517
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


INSERT with the JdbcAdapter

2016-11-30 Thread Christian Tzolov
Hi,

The insert statement via the JdbcAdapter fails with
"rel#78:Subset#1.ENUMERABLE.[]] could not be implemented;"

I'm testing SQL inserts using the foodmart dataset and postgresql
configured as a backend.

The following insert works fine on PostgreSQL but fails when run through the
JDBC adapter:

"INSERT INTO "foodmart"."expense_fact"("store_id", "account_id",
"exp_date", "time_id", "category_id", "currency_id", "amount") VALUES (666,
666, TIMESTAMP '1997-01-01 00:00:00', 666, '666', 666, 666);

The jdbc-adapter error:

Error: Error while executing SQL "INSERT INTO
"foodmart"."expense_fact"("store_id", "account_id", "exp_date", "time_id",
"category_id", "currency_id", "amount") VALUES (666, 666, TIMESTAMP
'1997-01-01 00:00:00', 666, '666', 666, 666)": Node
[rel#78:Subset#1.ENUMERABLE.[]] could not be implemented; planner state:

It looks like the error occurs before JdbcTableModificationRule#convert is
reached.

Is this a limitation or a bug?

Thanks,
Christian


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


Re: Rule to expand tha JdbcTableScan expression

2016-11-29 Thread Christian Tzolov
Thanks for the reaction Julian!

I'm aware of the "semantically equivalent" requirement, but our earlier
conversation (http://bit.ly/2gTi3ZE) left me with the impression that you
can "abuse" it.

Do you think Calcite shouldn't be used for such a use case? Mind that, in
addition to the SELECT example above, I would need to implement rules (or
visitors) that override and convert SQL UPDATEs into INSERTs.

Can you please elaborate on how I can plug in my own RelVisitor
implementation? Is there any sample code (I don't dare ask for docs ;).
Also, I'm afraid I'm completely unfamiliar with the macro approach (I guess
it is TableMacro).


Thanks,
Christian
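
(For orientation, a rough sketch of the RelVisitor/RelShuttle route Julian
suggests in the quoted reply below; the table check and the
buildLastVersionQuery helper are placeholders, not working code.)

import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.RelShuttleImpl;
import org.apache.calcite.rel.core.TableScan;

class LastVersionRewriter extends RelShuttleImpl {
  @Override public RelNode visit(TableScan scan) {
    // Only rewrite scans of the table whose rows carry version history.
    if (scan.getTable().getQualifiedName().contains("expense_fact")) {
      return buildLastVersionQuery(scan);   // placeholder: builds the Project/Filter/Window sub-plan
    }
    return super.visit(scan);
  }

  private RelNode buildLastVersionQuery(TableScan scan) {
    throw new UnsupportedOperationException("placeholder");
  }
}

Usage would be along the lines of:
RelNode rewritten = originalPlan.accept(new LastVersionRewriter());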



On 30 November 2016 at 00:47, Julian Hyde  wrote:

> What you've written isn't really a planner rule. A planner rule is
> supposed to create something semantically equivalent. And the planner
> then chooses the cheapest of those equivalent expressions.
>
> In contrast, you want to do search-and-replace on the tree. I think
> you should use a RelVisitor, or perhaps replace TableScan with a table
> macro and let the macro expander do its stuff.
>
> (There are other cases where planner rules don't cut it. One is field
> trimming, because when you visit each node, you want to return a node
> that has fewer fields.)
>
> On Tue, Nov 29, 2016 at 2:07 PM, Christian Tzolov 
> wrote:
> > We're working on a JDBC wrapper for HAWQ (SQL-on-HADOOP) that hopefully
> > would allow us "emulate" UPDATE and DELETE operations while using only
> > append-only (e.g. INSERT) in the background.
> >
> > Using the JDBC adapter i've been trying to convert an input sql query
> like
> > this:
> >
> > SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
> > amount  FROM foodmart.expense_fact
> >
> > Into one that looks like this:
> >
> > SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
> > amount
> > FROM (
> > SELECT store_id, account_id, exp_date, time_id, category_id,
> > currency_id, amount
> > FROM (
> > SELECT *, MAX(exp_date) OVER (PARTITION BY account_id) AS
> > last_version_number
> > FROM foodmart.expense_fact
> >  ) as link_last
> > WHERE exp_date = last_version_number
> > ) as current_version;
> >
> > If you run the second query directly the output relation is:
> >
> > "PLAN=JdbcToEnumerableConverter
> >   JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
> time_id=[$3],
> > category_id=[$4], currency_id=[$5], amount=[$6])
> > JdbcFilter(condition=[=($2, $7)])
> >   JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
> > time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
> >   last_version_number=[MAX($2) OVER (PARTITION BY $1
> RANGE
> > BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)])
> > JdbcTableScan(table=[[foodmart, expense_fact]])
> >
> >
> > I've created a rule meant to do this:
> > https://gist.github.com/tzolov/a426d10a55fdd712a67830b0154b33db   On
> match
> > it creates an expression that matches the one above:
> >
> > "Expanded RelNode:
> > JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
> > category_id=[$4], currency_id=[$5], amount=[$6])
> >   JdbcFilter(condition=[=($2, $7)])
> > JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
> > time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
> > last_version_number=[MAX($2) OVER (PARTITION BY $1 ROWS BETWEEN UNBOUNDED
> > PRECEDING AND UNBOUNDED FOLLOWING)])
> >   JdbcTableScan(table=[[foodmart, expense_fact]])"
> >
> > But after the Planner completes the optimization the final output is this
> > (e.g. the initial query):
> >
> > "PLAN=JdbcToEnumerableConverter
> >   JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
> time_id=[$3],
> > category_id=[$4], currency_id=[$5])
> > JdbcTableScan(table=[[foodmart, expense_fact]])
> >
> > 1. What am i missing? Are the field/type references copied correctly?
> >
> > 2. Also what is the right approach to prevent this Rule get in loop?
> >
> > I'm a bit stuck so any ideas and suggestions appreciated!
> >
> > Cheers,
> > Christian
> >
> > --
> > Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution
> Architect,
> > EMEA Practice Team | Pivotal <http://pivotal.io/>
> > ctzo...@pivotal.io|+31610285517
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


Rule to expand tha JdbcTableScan expression

2016-11-29 Thread Christian Tzolov
We're working on a JDBC wrapper for HAWQ (SQL-on-Hadoop) that hopefully
would allow us to "emulate" UPDATE and DELETE operations while using only
append-only operations (i.e. INSERT) in the background.

Using the JDBC adapter I've been trying to convert an input SQL query like
this:

SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
amount  FROM foodmart.expense_fact

Into one that looks like this:

SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
amount
FROM (
SELECT store_id, account_id, exp_date, time_id, category_id,
currency_id, amount
FROM (
SELECT *, MAX(exp_date) OVER (PARTITION BY account_id) AS
last_version_number
FROM foodmart.expense_fact
 ) as link_last
WHERE exp_date = last_version_number
) as current_version;

If you run the second query directly the output relation is:

"PLAN=JdbcToEnumerableConverter
  JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
category_id=[$4], currency_id=[$5], amount=[$6])
JdbcFilter(condition=[=($2, $7)])
  JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
  last_version_number=[MAX($2) OVER (PARTITION BY $1 RANGE
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)])
JdbcTableScan(table=[[foodmart, expense_fact]])


I've created a rule meant to do this:
https://gist.github.com/tzolov/a426d10a55fdd712a67830b0154b33db   On match
it creates an expression that matches the one above:

"Expanded RelNode:
JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
category_id=[$4], currency_id=[$5], amount=[$6])
  JdbcFilter(condition=[=($2, $7)])
JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
last_version_number=[MAX($2) OVER (PARTITION BY $1 ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING)])
  JdbcTableScan(table=[[foodmart, expense_fact]])"

But after the Planner completes the optimization the final output is this
(e.g. the initial query):

"PLAN=JdbcToEnumerableConverter
  JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
category_id=[$4], currency_id=[$5])
JdbcTableScan(table=[[foodmart, expense_fact]])

1. What am I missing? Are the field/type references copied correctly?

2. Also, what is the right approach to prevent this rule from getting into a loop?

I'm a bit stuck so any ideas and suggestions appreciated!

Cheers,
Christian

-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


Re: SqlImplementer doesn't handle RexOver when converting Rel into SQL

2016-11-24 Thread Christian Tzolov
Here is the JIRA ticket: https://issues.apache.org/jira/browse/CALCITE-1506
and the PR: https://github.com/apache/calcite/pull/329

The PR should cover most cases, but I'm not sure how to convert some range
bounds from Rel to SQL. I would appreciate it if you could review it.

Cheers,
Christian



On 23 November 2016 at 04:59, Julian Hyde  wrote:

> Yes, you won’t need to add “visit(Window)”. Just make SqlImplementor.toSql
> handle RexOver within its expressions.
>
> > On Nov 22, 2016, at 12:02 PM, Christian Tzolov 
> wrote:
> >
> > Thanks Julian,
> >
> > I will open JIRA ticket and will add test for postgres dialect. Regarding
> > your question:
> >
> >> I don’t know whether you’re seeing a Project where some of the
> > expressions are RexOver, or whether you’re seeing a Window. If the
> latter,
> > you’ll need to add a ‘public result visit(Window e)’ method in
> > RelToSqlConverter.
> >
> > I think it is a Project with RexOver expressions rather than Window. Here
> > is the explain plan:
> > 'PLAN'
> > 'JdbcToEnumerableConverter
> >  JdbcProject(id=[$0], device_id=[$1], transaction_value=[$2],
> > account_id=[$3], ts_millis=[$4], last_version_number=[MAX($4) OVER
> > (PARTITION BY $1 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED
> > FOLLOWING)])
> >JdbcFilter(condition=[=($1, 1445)])
> >  JdbcTableScan(table=[[HAWQ, transaction]])
> > '
> > So i guess the visit(Window e) wouldn't work in this case?
> >
> > Cheers,
> > Christian
> >
> >
> >
> >
> >
> >
> > On 22 November 2016 at 20:18, Julian Hyde  wrote:
> >
> >> It’s a missing feature (or bug), so definitely needs a JIRA case. We
> have
> >> other JIRA cases for deficiencies in the JDBC adapter, e.g.
> >> https://issues.apache.org/jira/browse/CALCITE-981 <
> >> https://issues.apache.org/jira/browse/CALCITE-981>,
> >> https://issues.apache.org/jira/browse/CALCITE-734 <
> >> https://issues.apache.org/jira/browse/CALCITE-734>.
> >>
> >> I don’t know whether you’re seeing a Project where some of the
> expressions
> >> are RexOver, or whether you’re seeing a Window. If the latter, you’ll
> need
> >> to add a ‘public result visit(Window e)’ method in RelToSqlConverter.
> >>
> >> Different JDBC providers support different features, and have different
> >> syntax. For those difference, add methods to SqlDialect.
> >>
> >> You’ll need to add tests to JdbcAdapterTest and unfortunately the
> default
> >> database, hsqldb, doesn’t support window functions, so you’ll need to
> add
> >> some tests that use postgres.
> >>
> >> Julian
> >>
> >>
> >>> On Nov 22, 2016, at 10:01 AM, Christian Tzolov 
> >> wrote:
> >>>
> >>> Hi there,
> >>>
> >>> Using jdbc adapter in the latest calcite (1.11.0) build i can not
> perform
> >>> windows aggregations. While converting the Rel into SQL the
> >>> RelImplementation drops the 'OVER(...)' blocks.
> >>>
> >>> For example if you try the following sql query against Postgres db:
> >>>
> >>> SELECT "id", "device_id", "transaction_value", "account_id",
> "ts_millis",
> >>>  MAX("ts_millis") OVER(partition by "device_id") as
> >>> "last_version_number"
> >>> FROM "HAWQ"."transaction"
> >>> WHERE "device_id" = 1445
> >>>
> >>> The result sql would look like this:
> >>>
> >>> [SELECT "id", "device_id", "transaction_value", "account_id",
> >> "ts_millis",
> >>>MAX("ts_millis") AS "last_version_number"
> >>> FROM "transaction"
> >>> WHERE "device_id" = 1445]
> >>>
> >>> e.g. the entire OVER block is ignored.
> >>>
> >>> I've managed to track the problem down the SqlImplementeror#571-7594 (
> >>> https://github.com/apache/calcite/blob/master/core/src/
> >> main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L571-L594
> >>> )
> >>>
> >>> There the RexOver is casted to (and handled as) RexCall instance and
> the
> >>> RexOver#window is completely ignored.
> >>>
> >>> This looks like a problem to me. Shall i open a Jira ticket for it?
> >>>
> >>> Also what should be the right way to include the RexWindow attribute in
> >> the
> >>> SQL generation?
> >>>
> >>> Cheers,
> >>> Christian
> >>>
> >>> --
> >>> Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution
> >> Architect,
> >>> EMEA Practice Team | Pivotal <http://pivotal.io/>
> >>> ctzo...@pivotal.io|+31610285517
> >>
>
>


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


[jira] [Created] (CALCITE-1506) Push OVER Clause to underlying SQL via JDBC adapter

2016-11-23 Thread Christian Tzolov (JIRA)
Christian Tzolov created CALCITE-1506:
-

 Summary: Push OVER Clause to underlying SQL via JDBC adapter
 Key: CALCITE-1506
 URL: https://issues.apache.org/jira/browse/CALCITE-1506
 Project: Calcite
  Issue Type: Bug
  Components: jdbc-adapter
Affects Versions: 1.10.0
Reporter: Christian Tzolov
Assignee: Julian Hyde


The JDBC adapter should push down the OVER clause for all dialects 
that support window functions. 

At the moment the Rel-to-SQL conversion ignores the 'OVER(...)'. The RexOver 
expression is treated as a plain RexCall and the RexOver#window attribute is 
not converted into SQL. 

For example, the following SQL query (using the Postgres dialect): 
{code:sql}
SELECT "id", "device_id", "transaction_value", "account_id", "ts_millis", 
MAX("ts_millis") OVER(partition by "device_id") as "last_version_number" 
FROM "HAWQ"."transaction"
WHERE "device_id" = 1445
{code}
is pushed down to the JDBC backend like this:
{code:sql}
SELECT "id", "device_id", "transaction_value", "account_id", "ts_millis", 
MAX("ts_millis") AS "last_version_number"
FROM "transaction"
WHERE "device_id" = 1445
{code}
The OVER clause is completely dropped!  Here is the plan:
{code}
JdbcToEnumerableConverter
  JdbcProject(id=[$0], device_id=[$1], transaction_value=[$2], account_id=[$3], 
ts_millis=[$4], last_version_number=[MAX($4) OVER (PARTITION BY $1 RANGE 
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)])
JdbcFilter(condition=[=($1, 1445)])
  JdbcTableScan(table=[[HAWQ, transaction]])
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: SqlImplementer doesn't handle RexOver when converting Rel into SQL

2016-11-22 Thread Christian Tzolov
Thanks Julian,

I will open a JIRA ticket and will add a test for the Postgres dialect.
Regarding your question:

> I don’t know whether you’re seeing a Project where some of the
expressions are RexOver, or whether you’re seeing a Window. If the latter,
you’ll need to add a ‘public result visit(Window e)’ method in
RelToSqlConverter.

I think it is a Project with RexOver expressions rather than Window. Here
is the explain plan:
'PLAN'
'JdbcToEnumerableConverter
  JdbcProject(id=[$0], device_id=[$1], transaction_value=[$2], account_id=[$3],
      ts_millis=[$4], last_version_number=[MAX($4) OVER (PARTITION BY $1 RANGE
      BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)])
    JdbcFilter(condition=[=($1, 1445)])
      JdbcTableScan(table=[[HAWQ, transaction]])
'
So I guess the visit(Window e) approach wouldn't work in this case?
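For reference, a small hedged check that should confirm this (RexOver.containsOver(...)
is an existing static utility in Calcite; the wrapper method is only an illustration):

    // Returns true when any projected expression contains a windowed aggregate
    // (RexOver), i.e. the OVER lives inside the Project rather than in a separate
    // Window rel, so a visit(Window e) method would never fire for this plan.
    static boolean projectCarriesOver(org.apache.calcite.rel.core.Project project) {
      return org.apache.calcite.rex.RexOver.containsOver(project.getProjects(), null);
    }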

Cheers,
Christian






On 22 November 2016 at 20:18, Julian Hyde  wrote:

> It’s a missing feature (or bug), so definitely needs a JIRA case. We have
> other JIRA cases for deficiencies in the JDBC adapter, e.g.
> https://issues.apache.org/jira/browse/CALCITE-981 <
> https://issues.apache.org/jira/browse/CALCITE-981>,
> https://issues.apache.org/jira/browse/CALCITE-734 <
> https://issues.apache.org/jira/browse/CALCITE-734>.
>
> I don’t know whether you’re seeing a Project where some of the expressions
> are RexOver, or whether you’re seeing a Window. If the latter, you’ll need
> to add a ‘public result visit(Window e)’ method in RelToSqlConverter.
>
> Different JDBC providers support different features, and have different
> syntax. For those difference, add methods to SqlDialect.
>
> You’ll need to add tests to JdbcAdapterTest and unfortunately the default
> database, hsqldb, doesn’t support window functions, so you’ll need to add
> some tests that use postgres.
>
> Julian
>
>
> > On Nov 22, 2016, at 10:01 AM, Christian Tzolov 
> wrote:
> >
> > Hi there,
> >
> > Using jdbc adapter in the latest calcite (1.11.0) build i can not perform
> > windows aggregations. While converting the Rel into SQL the
> > RelImplementation drops the 'OVER(...)' blocks.
> >
> > For example if you try the following sql query against Postgres db:
> >
> > SELECT "id", "device_id", "transaction_value", "account_id", "ts_millis",
> >   MAX("ts_millis") OVER(partition by "device_id") as
> > "last_version_number"
> > FROM "HAWQ"."transaction"
> > WHERE "device_id" = 1445
> >
> > The result sql would look like this:
> >
> > [SELECT "id", "device_id", "transaction_value", "account_id",
> "ts_millis",
> > MAX("ts_millis") AS "last_version_number"
> > FROM "transaction"
> > WHERE "device_id" = 1445]
> >
> > e.g. the entire OVER block is ignored.
> >
> > I've managed to track the problem down to SqlImplementor.java#L571-L594 (
> > https://github.com/apache/calcite/blob/master/core/src/
> main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L571-L594
> > )
> >
> > There the RexOver is casted to (and handled as) RexCall instance and the
> > RexOver#window is completely ignored.
> >
> > This looks like a problem to me. Shall i open a Jira ticket for it?
> >
> > Also what should be the right way to include the RexWindow attribute in
> the
> > SQL generation?
> >
> > Cheers,
> > Christian
> >
> > --
> > Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution
> Architect,
> > EMEA Practice Team | Pivotal <http://pivotal.io/>
> > ctzo...@pivotal.io|+31610285517
>


SqlImplementer doesn't handle RexOver when converting Rel into SQL

2016-11-22 Thread Christian Tzolov
Hi there,

Using the JDBC adapter in the latest Calcite (1.11.0) build I cannot perform
window aggregations. While converting the Rel into SQL, the Rel-to-SQL
conversion drops the 'OVER(...)' blocks.

For example, if you run the following SQL query against a Postgres DB:

SELECT "id", "device_id", "transaction_value", "account_id", "ts_millis",
   MAX("ts_millis") OVER(partition by "device_id") as
"last_version_number"
FROM "HAWQ"."transaction"
WHERE "device_id" = 1445

The resulting SQL looks like this:

[SELECT "id", "device_id", "transaction_value", "account_id", "ts_millis",
 MAX("ts_millis") AS "last_version_number"
FROM "transaction"
WHERE "device_id" = 1445]

i.e. the entire OVER block is ignored.

I've managed to track the problem down to SqlImplementor.java#L571-L594 (
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L571-L594
)

There the RexOver is cast to (and handled as) a RexCall instance, and the
RexOver#window attribute is completely ignored.

This looks like a problem to me. Shall I open a JIRA ticket for it?

Also, what would be the right way to include the RexWindow attribute in the
SQL generation?

Cheers,
Christian

-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


Re: Calcite jdbc driver dependencies

2016-11-21 Thread Christian Tzolov
Thanks for the reference Gethin!

Although CALCITE-1461
<https://issues.apache.org/jira/browse/CALCITE-1461> seems
related, IMO the issue is different.

I am using an up-to-date 1.11.0 codebase (that contains the 1461 patch).
The problem occurs while trying to create the Janino CompilerFactory, before
the ClassDeclaration is being used.

As an attempt to bypass the "org.codehaus.commons.compiler.properties" I've
tried to replace the
"compilerFactory = CompilerFactoryFactory.getDefaultCompilerFactory();"
by
"compilerFactory = CompilerFactoryFactory.getCompilerFactory("org.codehaus.janino.CompilerFactory");"

Still the problem remains. The CompilerFactory class cannot be found by
Thread.currentThread().getContextClassLoader(). The
"org.codehaus.janino.CompilerFactory" class is present in the uber JDBC jar,
but for some reason the thread context class loader cannot reach it.

Maybe it is time to open a JIRA ticket and continue the discussion there.
I just need to find a proper way to reproduce it.
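In the meantime, a hedged client-side workaround that has helped with similar Janino
context-classloader problems (whether it helps in this exact case is an assumption, and
the model path below is hypothetical):

    // Point the thread context class loader at the loader that actually holds the
    // uber jar, so Janino's CompilerFactoryFactory can resolve
    // org.codehaus.commons.compiler.properties, then open the connection as usual.
    public static void main(String[] args) throws Exception {
      ClassLoader uberJarLoader = org.apache.calcite.jdbc.Driver.class.getClassLoader();
      Thread.currentThread().setContextClassLoader(uberJarLoader);
      try (java.sql.Connection conn = java.sql.DriverManager.getConnection(
          "jdbc:calcite:model=/path/to/model.json")) {
        // ... run queries over the adapter
      }
    }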

Cheers,
Christian



On 21 November 2016 at 09:33, Gethin James 
wrote:

> Have you had a look at https://issues.apache.org/jira/browse/CALCITE-1461
> ? The fix version is 1.11.0.
>
> Best regards,
> Gethin.
> 
> From: Christian Tzolov 
> Sent: 21 November 2016 09:23:51
> To: dev@calcite.apache.org
> Subject: Re: Calcite jdbc driver dependencies
>
> Further digging suggests that the root cause of the problem is Janino not
> being able to resolve its 'org.codehaus.commons.compiler.properties' from
> the class path.
>
> The input stream is resolved to null in the
> CompilerFactoryFactory#getDefaultCompilerFactory:
>
> 'InputStream is =
> Thread.currentThread().getContextClassLoader().getResourceAsStream("org.
> codehaus.commons.compiler.properties");'
>
> The is is null although the 'org.codehaus.commons.compiler.properties' is
> available at the root for the uber jdbc jar. Apparently the
> Thread.currentThread().getContextClassLoader() is relative to something
> else but the root classpath.
>
> Any suggestions will be appreciated :) I've noticed similar issues reported
> on the Phoenix project.
>
> Cheers,
> Christian
>
>
>
> On 21 November 2016 at 00:44, Christian Tzolov  wrote:
>
> > The underlying exception causing the "Unable to instantiate java
> compiler" seems
> > to be:
> >
> > *Caused by: java.lang.ClassNotFoundException: No implementation of
> > org.codehaus.commons.compiler is on the class path. Typically, you'd have
> > 'janino.jar', or 'commons-compiler-jdk.jar', or both on the classpath.*
> > * at
> > org.codehaus.commons.compiler.CompilerFactoryFactory.
> getDefaultCompilerFactory(CompilerFactoryFactory.java:59)*
> >
> > * at
> > org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.compile(
> JaninoRelMetadataProvider.java:425)*
> > I've tried to manually add the 'janino.jar' and
> 'commons-compiler-jdk.jar'
> > jars as well as excluding them from the uber jdbc jar. But the error is
> > still there
> >
> > On 21 November 2016 at 00:18, Christian Tzolov 
> wrote:
> >
> >> The uberized jdbc adapter fails with "Unable to instantiate java
> >> compiler" when i try it with tools like DbVisualizer or SQurriel .
> >>
> >> Noticed same error has been discussed in two others thread but there is
> >> no resolution. In my case the adapter pom contains a single janino-2.7.6
> >> dependency (via the calcite-core).
> >>
> >> Could this be related to the maven-shade-plugin configuration?
> >>
> >>
> >> On 20 November 2016 at 23:52, Christian Tzolov 
> >> wrote:
> >>
> >>> FYI, the uber jar jdbc did the job
> >>>
> >>> On 20 November 2016 at 22:34, Christian Tzolov 
> >>> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I've been working on Calcite adapter for a nosql datastore.
> >>>>
> >>>> The prototype works fine when i test it with the Sqlline command line
> >>>> tool. But it is not entirely clear how to use it through a jdbc
> driver from
> >>>> 3rd party tools. I've read the avatica docs but not sure if i'm
> getting it
> >>>> right.
> >>>>
> >>>> For the loca jdbc driver case do i need build my adapter as an uber
> jar
> >>>> to bundle all necessary dependencies? If so are there any examples
> around?
> >>>>
> >>>> For the remote jdbc mode do i need to create and build a customer
> >>>> avatica server that pulls my adapters dependencies?
> >>>>
> >>>> I am sorry if this question has been answered already
> >>>>
> >>>> Cheers,
> >>>> Christian
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >
>


Re: Calcite jdbc driver dependencies

2016-11-21 Thread Christian Tzolov
Further digging suggests that the root cause of the problem is Janino not
being able to resolve its 'org.codehaus.commons.compiler.properties' from
the class path.

The input stream is resolved to null in the
CompilerFactoryFactory#getDefaultCompilerFactory:

'InputStream is =
Thread.currentThread().getContextClassLoader().getResourceAsStream("org.codehaus.commons.compiler.properties");'

The returned InputStream is null, although 'org.codehaus.commons.compiler.properties' is
available at the root of the uber JDBC jar. Apparently
Thread.currentThread().getContextClassLoader() resolves against something
other than the root classpath.

Any suggestions will be appreciated :) I've noticed similar issues reported
on the Phoenix project.

Cheers,
Christian



On 21 November 2016 at 00:44, Christian Tzolov  wrote:

> The underlying exception causing the "Unable to instantiate java compiler" 
> seems
> to be:
>
> *Caused by: java.lang.ClassNotFoundException: No implementation of
> org.codehaus.commons.compiler is on the class path. Typically, you'd have
> 'janino.jar', or 'commons-compiler-jdk.jar', or both on the classpath.*
> * at
> org.codehaus.commons.compiler.CompilerFactoryFactory.getDefaultCompilerFactory(CompilerFactoryFactory.java:59)*
>
> * at
> org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.compile(JaninoRelMetadataProvider.java:425)*
> I've tried to manually add the 'janino.jar' and 'commons-compiler-jdk.jar'
> jars as well as excluding them from the uber jdbc jar. But the error is
> still there
>
> On 21 November 2016 at 00:18, Christian Tzolov  wrote:
>
>> The uberized jdbc adapter fails with "Unable to instantiate java
>> compiler" when i try it with tools like DbVisualizer or SQurriel .
>>
>> Noticed same error has been discussed in two others thread but there is
>> no resolution. In my case the adapter pom contains a single janino-2.7.6
>> dependency (via the calcite-core).
>>
>> Could this be related to the maven-shade-plugin configuration?
>>
>>
>> On 20 November 2016 at 23:52, Christian Tzolov 
>> wrote:
>>
>>> FYI, the uber jar jdbc did the job
>>>
>>> On 20 November 2016 at 22:34, Christian Tzolov 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I've been working on Calcite adapter for a nosql datastore.
>>>>
>>>> The prototype works fine when i test it with the Sqlline command line
>>>> tool. But it is not entirely clear how to use it through a jdbc driver from
>>>> 3rd party tools. I've read the avatica docs but not sure if i'm getting it
>>>> right.
>>>>
>>>> For the loca jdbc driver case do i need build my adapter as an uber jar
>>>> to bundle all necessary dependencies? If so are there any examples around?
>>>>
>>>> For the remote jdbc mode do i need to create and build a customer
>>>> avatica server that pulls my adapters dependencies?
>>>>
>>>> I am sorry if this question has been answered already
>>>>
>>>> Cheers,
>>>> Christian
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>


Re: Calcite jdbc driver dependencies

2016-11-20 Thread Christian Tzolov
The underlying exception causing the "Unable to instantiate java
compiler" seems
to be:

*Caused by: java.lang.ClassNotFoundException: No implementation of
org.codehaus.commons.compiler is on the class path. Typically, you'd have
'janino.jar', or 'commons-compiler-jdk.jar', or both on the classpath.*
* at
org.codehaus.commons.compiler.CompilerFactoryFactory.getDefaultCompilerFactory(CompilerFactoryFactory.java:59)*

* at
org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.compile(JaninoRelMetadataProvider.java:425)*
I've tried manually adding the 'janino.jar' and 'commons-compiler-jdk.jar'
jars, as well as excluding them from the uber JDBC jar, but the error is
still there.

On 21 November 2016 at 00:18, Christian Tzolov  wrote:

> The uberized jdbc adapter fails with "Unable to instantiate java compiler"
> when i try it with tools like DbVisualizer or SQurriel .
>
> Noticed same error has been discussed in two others thread but there is no
> resolution. In my case the adapter pom contains a single janino-2.7.6
> dependency (via the calcite-core).
>
> Could this be related to the maven-shade-plugin configuration?
>
>
> On 20 November 2016 at 23:52, Christian Tzolov  wrote:
>
>> FYI, the uber jar jdbc did the job
>>
>> On 20 November 2016 at 22:34, Christian Tzolov 
>> wrote:
>>
>>> Hi,
>>>
>>> I've been working on Calcite adapter for a nosql datastore.
>>>
>>> The prototype works fine when i test it with the Sqlline command line
>>> tool. But it is not entirely clear how to use it through a jdbc driver from
>>> 3rd party tools. I've read the avatica docs but not sure if i'm getting it
>>> right.
>>>
>>> For the loca jdbc driver case do i need build my adapter as an uber jar
>>> to bundle all necessary dependencies? If so are there any examples around?
>>>
>>> For the remote jdbc mode do i need to create and build a customer
>>> avatica server that pulls my adapters dependencies?
>>>
>>> I am sorry if this question has been answered already
>>>
>>> Cheers,
>>> Christian
>>>
>>>
>>
>>
>>
>>
>
>


Re: Calcite jdbc driver dependencies

2016-11-20 Thread Christian Tzolov
The uberized JDBC adapter fails with "Unable to instantiate java compiler"
when I try it with tools like DbVisualizer or SQuirreL.

I noticed the same error has been discussed in two other threads, but there is no
resolution. In my case the adapter POM contains a single janino-2.7.6
dependency (via calcite-core).

Could this be related to the maven-shade-plugin configuration?


On 20 November 2016 at 23:52, Christian Tzolov  wrote:

> FYI, the uber jar jdbc did the job
>
> On 20 November 2016 at 22:34, Christian Tzolov  wrote:
>
>> Hi,
>>
>> I've been working on Calcite adapter for a nosql datastore.
>>
>> The prototype works fine when i test it with the Sqlline command line
>> tool. But it is not entirely clear how to use it through a jdbc driver from
>> 3rd party tools. I've read the avatica docs but not sure if i'm getting it
>> right.
>>
>> For the loca jdbc driver case do i need build my adapter as an uber jar
>> to bundle all necessary dependencies? If so are there any examples around?
>>
>> For the remote jdbc mode do i need to create and build a customer avatica
>> server that pulls my adapters dependencies?
>>
>> I am sorry if this question has been answered already
>>
>> Cheers,
>> Christian
>>
>>
>
>
>
>


Re: Calcite jdbc driver dependencies

2016-11-20 Thread Christian Tzolov
FYI, the uber JDBC jar did the job.

On 20 November 2016 at 22:34, Christian Tzolov  wrote:

> Hi,
>
> I've been working on Calcite adapter for a nosql datastore.
>
> The prototype works fine when i test it with the Sqlline command line
> tool. But it is not entirely clear how to use it through a jdbc driver from
> 3rd party tools. I've read the avatica docs but not sure if i'm getting it
> right.
>
> For the loca jdbc driver case do i need build my adapter as an uber jar to
> bundle all necessary dependencies? If so are there any examples around?
>
> For the remote jdbc mode do i need to create and build a customer avatica
> server that pulls my adapters dependencies?
>
> I am sorry if this question has been answered already
>
> Cheers,
> Christian
>
>


-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


Calcite jdbc driver dependencies

2016-11-20 Thread Christian Tzolov
Hi,

I've been working on a Calcite adapter for a NoSQL datastore.

The prototype works fine when I test it with the sqlline command-line tool,
but it is not entirely clear how to use it through a JDBC driver from
3rd-party tools. I've read the Avatica docs but am not sure I'm getting it
right.

For the local JDBC driver case, do I need to build my adapter as an uber jar to
bundle all necessary dependencies? If so, are there any examples around?

For the remote JDBC mode, do I need to create and build a custom Avatica
server that pulls in my adapter's dependencies?
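For reference, a hedged sketch of that second option, assuming the standalone Avatica
server API (JdbcMeta / LocalService / HttpServer); the model.json path and the port are
hypothetical placeholders:

    import org.apache.calcite.avatica.jdbc.JdbcMeta;
    import org.apache.calcite.avatica.remote.Driver;
    import org.apache.calcite.avatica.remote.LocalService;
    import org.apache.calcite.avatica.server.HttpServer;

    public class AdapterAvaticaServer {
      public static void main(String[] args) throws Exception {
        // Expose the local Calcite driver (with the adapter on the classpath)
        // behind an Avatica RPC endpoint that remote JDBC clients can talk to.
        JdbcMeta meta = new JdbcMeta("jdbc:calcite:model=/path/to/model.json");
        HttpServer server = new HttpServer.Builder()
            .withHandler(new LocalService(meta), Driver.Serialization.PROTOBUF)
            .withPort(8765)
            .build();
        server.start();
        server.join();
      }
    }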

I am sorry if this question has been answered already

Cheers,
Christian


Re: Modifying the AST (SqlNode) on the fly

2016-11-01 Thread Christian Tzolov
Sounds great!

I am concerned by this remark in the documentation: "The JDBC adapter
currently only pushes down table scan operations".
Is there an easy way to pass the whole (new) query to the backend
database? The query would always concern tables from a single database
(i.e. no multiple JDBC sources).

On 1 November 2016 at 23:58, Julian Hyde  wrote:

> Definitely re-use the existing JDBC adapter. Within the adapter are ways
> to tweak (a) the dialect of the SQL you generate, (b) the capabilities of
> the DB (e.g. whether it supports OFFSET). Tweaking those knobs is a lot
> easier than building a new adapter.
>
> Julian
>
>
> > On Nov 1, 2016, at 3:55 PM, Christian Tzolov  wrote:
> >
> > Thanks again!
> >
> > Would it make sense to reuse/extend the existing jdbc adapter or i better
> > start from scratch?
> > Since my backend DB uses postgres dialect i wonder what is the easiest
> way
> > to modify the relations and pass the whole query to the target DB.
> >
> > On 1 November 2016 at 22:42, Julian Hyde  jh...@apache.org>> wrote:
> >
> >> You might find that you only need to change the root node (TableModify)
> >> from UPDATE to INSERT, plus maybe a Project immediately underneath it.
> You
> >> can re-use the parts of the tree that you don’t change. This is typical
> of
> >> how planner rules work.
> >>
> >>> On Nov 1, 2016, at 2:38 PM, Christian Tzolov 
> wrote:
> >>>
> >>> Thanks Julian!
> >>>
> >>> So i can override the whole RelNode tree from UPDATE to INSERT for
> >> example?
> >>> Was not sure if this is allowed in the RelNode phase.
> >>>
> >>> I guess as a start i need to implement my own TableModify relation and
> >>> related rule to substitute the LogicalTableModify and alter the
> >>> underlying operation.
> >>>
> >>> Is the there some sample code doing similar relation tree overriding?
> >>>
> >>> Thanks,
> >>> Christian
> >>>
> >>>
> >>>
> >>> On 1 November 2016 at 20:21, Julian Hyde <jh...@apache.org> wrote:
> >>>
> >>>> Calcite is the right tool for the job, but our experience is that
> >> hacking
> >>>> the AST is not the way to do it. You can do simple transformations on
> >> the
> >>>> AST, but SQL has complex semantics (e.g. consider the rules required
> to
> >>>> look up an unqualified column name in a sub-query), so complex
> >>>> transformations tend to break.
> >>>>
> >>>> Calcite is a query transformation engine, and the best place to do
> >>>> transformations is on the relational algebra (RelNode) rather than AST
> >>>> (SqlNode).
> >>>>
> >>>> Of course, you’re welcome to use just Calcite’s SQL parser, work on
> the
> >>>> AST, and then convert the AST back to SQL. It may be sufficient for
> your
> >>>> task - it’s your call.
> >>>>
> >>>> Julian
> >>>>
> >>>>
> >>>>> On Nov 1, 2016, at 12:14 PM, Christian Tzolov <ctzo...@pivotal.io>
> >>>> wrote:
> >>>>>
> >>>>> Hi guys,
> >>>>>
> >>>>> I am looking for a solution/approach that would allow me intercept
> and
> >>>> JDBC
> >>>>> call, replace an UPDATE statement by related INSERT and run the new
> SQL
> >>>> on
> >>>>> the backend database (using JDBC). Similarly on SELECT i would like
> to
> >>>> add
> >>>>> a filter to the existing statement.
> >>>>>
> >>>>> My big-data DB doesn't support UPDATES so i am looking for a way to
> >>>> emulate
> >>>>> UPDATEs by converting them into INSERTs, append records with
> additional
> >>>>> 'timestamp' column.  On read the SELECT would be extended with
> >> additional
> >>>>> Filter to retrieve the records with most recent 'timestamp'.
> >>>>>
> >>>>> Is Calcite the right tool to tackle such problem? Or are there any
> >> better
> >>>>> alternatives?
> >>>>>
> >>>>> If Calcite is the way to go, it seems that the right place to perform
> >>>> such
> >>>>> query alteration is at SqlNode AST before the RelNode?
> >>>>>
> >>>>> For the existing JDBC Adapter is there a way to pass the whole SQL
> >> query
> >>>> to
> >>>>> the backend database? There query will always be based on tables
> from a
> >>>>> single JDBC database.
> >>>>>
> >>>>> Thanks,
> >>>>> Christian
> >>>>>
> >>>>> --
> >>>>>
> >>>>
> >>>>
> >>>>
> >>> --
> >>
> >
> >
> >
> > --
> > Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
> > EMEA Practice Team | Pivotal <http://pivotal.io/>
> > ctzo...@pivotal.io|+31610285517
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


Re: Modifying the AST (SqlNode) on the fly

2016-11-01 Thread Christian Tzolov
Thanks again!

Would it make sense to reuse/extend the existing JDBC adapter, or am I better off
starting from scratch?
Since my backend DB uses the Postgres dialect, I wonder what the easiest way is
to modify the relations and pass the whole query to the target DB.

On 1 November 2016 at 22:42, Julian Hyde  wrote:

> You might find that you only need to change the root node (TableModify)
> from UPDATE to INSERT, plus maybe a Project immediately underneath it. You
> can re-use the parts of the tree that you don’t change. This is typical of
> how planner rules work.
>
> > On Nov 1, 2016, at 2:38 PM, Christian Tzolov  wrote:
> >
> > Thanks Julian!
> >
> > So i can override the whole RelNode tree from UPDATE to INSERT for
> example?
> > Was not sure if this is allowed in the RelNode phase.
> >
> > I guess as a start i need to implement my own TableModify relation and
> > related rule to substitute the LogicalTableModify and alter the
> > underlying operation.
> >
> > Is the there some sample code doing similar relation tree overriding?
> >
> > Thanks,
> > Christian
> >
> >
> >
> > On 1 November 2016 at 20:21, Julian Hyde  jh...@apache.org>> wrote:
> >
> >> Calcite is the right tool for the job, but our experience is that
> hacking
> >> the AST is not the way to do it. You can do simple transformations on
> the
> >> AST, but SQL has complex semantics (e.g. consider the rules required to
> >> look up an unqualified column name in a sub-query), so complex
> >> transformations tend to break.
> >>
> >> Calcite is a query transformation engine, and the best place to do
> >> transformations is on the relational algebra (RelNode) rather than AST
> >> (SqlNode).
> >>
> >> Of course, you’re welcome to use just Calcite’s SQL parser, work on the
> >> AST, and then convert the AST back to SQL. It may be sufficient for your
> >> task - it’s your call.
> >>
> >> Julian
> >>
> >>
> >>> On Nov 1, 2016, at 12:14 PM, Christian Tzolov 
> >> wrote:
> >>>
> >>> Hi guys,
> >>>
> >>> I am looking for a solution/approach that would allow me intercept and
> >> JDBC
> >>> call, replace an UPDATE statement by related INSERT and run the new SQL
> >> on
> >>> the backend database (using JDBC). Similarly on SELECT i would like to
> >> add
> >>> a filter to the existing statement.
> >>>
> >>> My big-data DB doesn't support UPDATES so i am looking for a way to
> >> emulate
> >>> UPDATEs by converting them into INSERTs, append records with additional
> >>> 'timestamp' column.  On read the SELECT would be extended with
> additional
> >>> Filter to retrieve the records with most recent 'timestamp'.
> >>>
> >>> Is Calcite the right tool to tackle such problem? Or are there any
> better
> >>> alternatives?
> >>>
> >>> If Calcite is the way to go, it seems that the right place to perform
> >> such
> >>> query alteration is at SqlNode AST before the RelNode?
> >>>
> >>> For the existing JDBC Adapter is there a way to pass the whole SQL
> query
> >> to
> >>> the backend database? There query will always be based on tables from a
> >>> single JDBC database.
> >>>
> >>> Thanks,
> >>> Christian
> >>>
> >>> --
> >>>
> >>
> >>
> >>
> > --
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


Re: Modifying the AST (SqlNode) on the fly

2016-11-01 Thread Christian Tzolov
Thanks Julian!

So I can rewrite the whole RelNode tree from UPDATE to INSERT, for example?
I was not sure whether this is allowed in the RelNode phase.

I guess as a start I need to implement my own TableModify relation and a
related rule to substitute the LogicalTableModify and alter the
underlying operation.

Is there some sample code doing similar relation tree rewriting?
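For the record, a rough hedged sketch of what such a rule could look like (this is an
illustration, not project code; LogicalTableModify.create(...) and the operand(...)
builder are existing Calcite API, while the rewrite strategy itself is the assumption):

    import org.apache.calcite.plan.RelOptRule;
    import org.apache.calcite.plan.RelOptRuleCall;
    import org.apache.calcite.rel.RelNode;
    import org.apache.calcite.rel.core.TableModify;
    import org.apache.calcite.rel.logical.LogicalTableModify;

    public class UpdateToInsertRule extends RelOptRule {
      public UpdateToInsertRule() {
        super(operand(LogicalTableModify.class, any()), "UpdateToInsertRule");
      }

      @Override public void onMatch(RelOptRuleCall call) {
        final LogicalTableModify modify = call.rel(0);
        if (modify.getOperation() != TableModify.Operation.UPDATE) {
          return;
        }
        // Re-create the node as an INSERT over the same input. A real implementation
        // would also add a Project here that appends the extra 'timestamp' column.
        final RelNode insert = LogicalTableModify.create(modify.getTable(),
            modify.getCatalogReader(), modify.getInput(),
            TableModify.Operation.INSERT, null, null, modify.isFlattened());
        call.transformTo(insert);
      }
    }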

Thanks,
Christian



On 1 November 2016 at 20:21, Julian Hyde  wrote:

> Calcite is the right tool for the job, but our experience is that hacking
> the AST is not the way to do it. You can do simple transformations on the
> AST, but SQL has complex semantics (e.g. consider the rules required to
> look up an unqualified column name in a sub-query), so complex
> transformations tend to break.
>
> Calcite is a query transformation engine, and the best place to do
> transformations is on the relational algebra (RelNode) rather than AST
> (SqlNode).
>
> Of course, you’re welcome to use just Calcite’s SQL parser, work on the
> AST, and then convert the AST back to SQL. It may be sufficient for your
> task - it’s your call.
>
> Julian
>
>
> > On Nov 1, 2016, at 12:14 PM, Christian Tzolov 
> wrote:
> >
> > Hi guys,
> >
> > I am looking for a solution/approach that would allow me intercept and
> JDBC
> > call, replace an UPDATE statement by related INSERT and run the new SQL
> on
> > the backend database (using JDBC). Similarly on SELECT i would like to
> add
> > a filter to the existing statement.
> >
> > My big-data DB doesn't support UPDATES so i am looking for a way to
> emulate
> > UPDATEs by converting them into INSERTs, append records with additional
> > 'timestamp' column.  On read the SELECT would be extended with additional
> > Filter to retrieve the records with most recent 'timestamp'.
> >
> > Is Calcite the right tool to tackle such problem? Or are there any better
> > alternatives?
> >
> > If Calcite is the way to go, it seems that the right place to perform
> such
> > query alteration is at SqlNode AST before the RelNode?
> >
> > For the existing JDBC Adapter is there a way to pass the whole SQL query
> to
> > the backend database? There query will always be based on tables from a
> > single JDBC database.
> >
> > Thanks,
> > Christian
> >
> > --
> >
>
>
>
--


Modifying the AST (SqlNode) on the fly

2016-11-01 Thread Christian Tzolov
Hi guys,

I am looking for a solution/approach that would allow me to intercept a JDBC
call, replace an UPDATE statement with a related INSERT, and run the new SQL on
the backend database (using JDBC). Similarly, on SELECT I would like to add
a filter to the existing statement.

My big-data DB doesn't support UPDATEs, so I am looking for a way to emulate
UPDATEs by converting them into INSERTs, appending records with an additional
'timestamp' column. On read, the SELECT would be extended with an additional
Filter to retrieve the records with the most recent 'timestamp'.

Is Calcite the right tool to tackle such a problem, or are there any better
alternatives?

If Calcite is the way to go, it seems that the right place to perform such
query alteration is at the SqlNode AST, before the RelNode?

For the existing JDBC adapter, is there a way to pass the whole SQL query to
the backend database? The query will always be based on tables from a
single JDBC database.

Thanks,
Christian

-- 

Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517


Re: How to check for SELECT DISTINCT ...?

2016-05-13 Thread Christian Tzolov
Thanks for the explanation and the suggestions Julian!

After some debugging and enough "explain plan for", I figured out the same
:) - that DISTINCT is represented as a GROUP BY over the projected fields.

>>> You should write a rule that matches a Sort on top of an Aggregate that
uses the same keys.

I've created such a rule (operand(Sort.class, operand(Aggregate.class, any()))),
but I am not sure what the safest way is to determine whether the keys are
the same. Would it be enough to compare their row types, or do I need to check the
collations?
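For reference, a hedged sketch of one way to do the check, under the assumption that the
Aggregate has no aggregate calls (i.e. it came from a plain SELECT DISTINCT), so its output
fields 0..groupCount-1 are exactly the distinct keys and it is enough to verify that every
sort field index falls in that range:

    import org.apache.calcite.plan.RelOptRule;
    import org.apache.calcite.plan.RelOptRuleCall;
    import org.apache.calcite.rel.RelFieldCollation;
    import org.apache.calcite.rel.core.Aggregate;
    import org.apache.calcite.rel.core.Sort;

    public class DistinctSortRule extends RelOptRule {
      public DistinctSortRule() {
        super(operand(Sort.class, operand(Aggregate.class, any())), "DistinctSortRule");
      }

      @Override public void onMatch(RelOptRuleCall call) {
        final Sort sort = call.rel(0);
        final Aggregate aggregate = call.rel(1);
        if (!aggregate.getAggCallList().isEmpty()) {
          return; // not a pure DISTINCT: it computes aggregate functions
        }
        // The Aggregate's output row is just the GROUP BY keys, so sorting on the
        // distinct keys means every sort field index is below groupCount.
        final int groupCount = aggregate.getGroupCount();
        for (RelFieldCollation fc : sort.getCollation().getFieldCollations()) {
          if (fc.getFieldIndex() >= groupCount) {
            return; // sorted on something other than the distinct keys
          }
        }
        // safe to push the ORDER BY down together with the DISTINCT here
      }
    }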

The suggestion to expand the select in case of unique keys is excellent,
but I have to investigate whether the backend system, in the way I'm using
it, allows primary keys.

How does RelMdUniqueKeys determine whether the keys are unique? Is this
meta info provided by the table (or row type) definition, or is it computed on
the fly?
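My guess (hedged) is that it is computed on the fly by the metadata handlers walking the
rel tree, and that for a bare table scan it falls back to the table definition (e.g.
Statistic#isKey), so the table would have to declare its keys. A rough sketch of how such
a check could be wired, where input and sortKeys are assumed to come from the rule:

    static boolean sortKeysAlreadyUnique(org.apache.calcite.rel.RelNode input,
        org.apache.calcite.util.ImmutableBitSet sortKeys) {
      final org.apache.calcite.rel.metadata.RelMetadataQuery mq =
          org.apache.calcite.rel.metadata.RelMetadataQuery.instance();
      // areColumnsUnique() may return null when the answer is unknown.
      return Boolean.TRUE.equals(mq.areColumnsUnique(input, sortKeys));
    }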






On 13 May 2016 at 18:54, Julian Hyde  wrote:

> By the time the query is translated into relational algebra, SELECT
> DISTINCT will have been converted into an Aggregate with no aggregate
> functions.
>
> So, for instance,
>
>   select distinct deptno from emp
>
> and
>
>   select deptno from emp group by deptno
>
> will be indistinguishable if you look at the relational algebra.
>
> You should write a rule that matches a Sort on top of an Aggregate
> that uses the same keys.
>
> Also, I don't know whether your system supports primary keys. But if,
> for example, empno is unique, then you would want to be able to
> evaluate
>
>   select empno from emp order by empno
>
> even though there is no explicit SELECT DISTINCT. You can use the
> RelMdUniqueKeys metadata to figure out whether the sort key is already
> unique.
>
> Julian
>
>
> On Fri, May 13, 2016 at 8:22 AM, Tzolov, Christian 
> wrote:
> > Hi there,
> >
> > I am working on a Calcite adapter for a datastore that supports a
> limited subset of SQL expressions. My goal is to push down the supported
> expression.
> >
> > This datastore supports "ORDER BY” only if the SELECT DISTINCT
> expression is used.
> >
> > What would be the best way to check if the select DISTINCT keyword is
> used from within my SortRule implementation?
> >
> > Thanks,
> > Christian
> >
>



-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
ctzo...@pivotal.io|+31610285517