[jira] [Commented] (CALCITE-2301) JDBC adapter: use query timeout from the top-level statement

2019-01-09 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738621#comment-16738621
 ] 

Julian Hyde commented on CALCITE-2301:
--

Yes, CURRENT_TIMESTAMP should be in the connection's time zone, not UTC. See 
[https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions037.htm] and 
also the SQL standard. Please log it.

> JDBC adapter: use query timeout from the top-level statement
> 
>
> Key: CALCITE-2301
> URL: https://issues.apache.org/jira/browse/CALCITE-2301
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.16.0
>Reporter: Fan Yang
>Assignee: Julian Hyde
>Priority: Minor
>  Labels: pull-request-available
> Fix For: next
>
>
> It's not a good idea to have the magic number here. Also, databases may not 
> get back within 10 seconds for various reasons (e.g., in the case of JDBC 
> schema).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2301) JDBC adapter: use query timeout from the top-level statement

2019-01-09 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738618#comment-16738618
 ] 

Vladimir Sitnikov commented on CALCITE-2301:


By the way, 

{code:java}UTC_TIMESTAMP("utcTimestamp", Long.class),

/** The time at which the current statement started executing. In
 * milliseconds after 1970-01-01 00:00:00, UTC. Required. */  <===  UTC
CURRENT_TIMESTAMP("currentTimestamp", Long.class),

/** The time at which the current statement started executing. In
 * milliseconds after 1970-01-01 00:00:00, in the time zone of the current
 * statement. Required. */
LOCAL_TIMESTAMP("localTimestamp", Long.class),


  final TimeZone timeZone = connection.getTimeZone();
  final long localOffset = timeZone.getOffset(time);
  final long currentOffset = localOffset;
...
  builder.put(Variable.UTC_TIMESTAMP.camelName, time)
  .put(Variable.CURRENT_TIMESTAMP.camelName, time + currentOffset)  // <=== offset
  .put(Variable.LOCAL_TIMESTAMP.camelName, time + localOffset)
  .put(Variable.TIME_ZONE.camelName, timeZone)
{code}

Frankly speaking, I don't like the concept of "time + currentOffset". However, 
the more surprising fact is that the JavaDoc for CURRENT_TIMESTAMP reads 
{{milliseconds after 1970-01-01 00:00:00, UTC}}.
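
To illustrate the concern about "time + currentOffset", here is a minimal sketch. The class OffsetDemo and its method names are hypothetical and are not Calcite code, and the GMT-08:00 zone is an assumed connection time zone. It suggests that the shifted value is no longer "milliseconds after 1970-01-01 00:00:00, UTC": it only yields the local wall-clock time when it is read as if it were a UTC epoch value.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.TimeZone;

/** Hypothetical demo class; not part of Calcite. */
public class OffsetDemo {
  /** Shifts epoch millis by the zone offset, mirroring "time + currentOffset". */
  static long shifted(long epochMillis, TimeZone zone) {
    return epochMillis + zone.getOffset(epochMillis);
  }

  /** Wall-clock time in the zone, leaving the epoch value untouched. */
  static LocalDateTime wallClock(long epochMillis, TimeZone zone) {
    return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), zone.toZoneId());
  }

  /** Reads a millisecond value as if it were a true UTC epoch value. */
  static LocalDateTime asIfUtc(long millis) {
    return LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC);
  }

  public static void main(String[] args) {
    TimeZone zone = TimeZone.getTimeZone("GMT-08:00"); // assumed connection time zone
    long time = 1546992000000L;                        // 2019-01-09 00:00:00 UTC
    long shiftedTime = shifted(time, zone);
    System.out.println(shiftedTime - time);     // the zone offset in milliseconds
    System.out.println(wallClock(time, zone));  // wall clock derived from the true epoch value
    System.out.println(asIfUtc(shiftedTime));   // same wall clock, derived from a shifted value
  }
}
```

Both printed date-times agree, but only the unshifted value identifies the actual instant, which is why a JavaDoc that calls the shifted value "UTC" is surprising.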



[jira] [Commented] (CALCITE-2635) getMonotonocity is slow on wide tables

2019-01-09 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738613#comment-16738613
 ] 

Julian Hyde commented on CALCITE-2635:
--

I have a $350 [Intel 
NUC|https://www.amazon.com/gp/product/B01N2UMKZ5/ref=ppx_od_dt_b_detailpages00?ie=UTF8=1]
 in my home office. It is 99% idle. You're welcome to have ssh access.

> getMonotonocity is slow on wide tables
> --
>
> Key: CALCITE-2635
> URL: https://issues.apache.org/jira/browse/CALCITE-2635
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Gian Merlino
>Assignee: Gian Merlino
>Priority: Major
>  Labels: performance
> Fix For: 1.19.0
>
>
> RelOptTableImpl's getMonotonicity does an indexOf on 
> {{rowType.getFieldNames()}}, which is O(N) in the number of fields. 
> IdentifierNamespace calls getMonotonicity once for every field in the table 
> namespace, so it becomes O(N^2) in the number of fields. We observed 2-4 
> second query planning times with a table that had 18,000 columns, reduced to 
> about 150ms after patching getMonotonicity to be O(1) in the number of fields.





[jira] [Resolved] (CALCITE-2635) getMonotonocity is slow on wide tables

2019-01-09 Thread Vladimir Sitnikov (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Sitnikov resolved CALCITE-2635.

   Resolution: Fixed
Fix Version/s: 1.19.0

Fixed in 
https://gitbox.apache.org/repos/asf?p=calcite.git;a=commit;h=62b47aeeb7eeb59beaf5b8f3b54a5c58ba4ca76d



[jira] [Commented] (CALCITE-2301) JDBC adapter: use query timeout from the top-level statement

2019-01-09 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738605#comment-16738605
 ] 

Julian Hyde commented on CALCITE-2301:
--

Yes, I know. I call it "execution time". But it's much better that it is 
exactly the same rather than possibly 1 or 2 milliseconds off. That's what I 
(inaccurately) called a race condition.



[jira] [Commented] (CALCITE-2635) getMonotonocity is slow on wide tables

2019-01-09 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738603#comment-16738603
 ] 

Vladimir Sitnikov commented on CALCITE-2635:


I guess one of the most difficult things to get is hardware that is not 
shared among 100500 projects.



[jira] [Commented] (CALCITE-2301) JDBC adapter: use query timeout from the top-level statement

2019-01-09 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738602#comment-16738602
 ] 

Vladimir Sitnikov commented on CALCITE-2301:


By the way, UTC_TIMESTAMP is almost the same as my System.currentTimeMillis 
since UTC_TIMESTAMP is computed quite late. For instance, UTC_TIMESTAMP does 
not include planning time.



[jira] [Commented] (CALCITE-2301) JDBC adapter: use query timeout from the top-level statement

2019-01-09 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738601#comment-16738601
 ] 

Julian Hyde commented on CALCITE-2301:
--

A test for just PostgreSQL would be fine. I don't expect this to break often.

Yes, using UTC_TIMESTAMP is an improvement. Thanks.



[jira] [Commented] (CALCITE-2301) JDBC adapter: use query timeout from the top-level statement

2019-01-09 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738599#comment-16738599
 ] 

Julian Hyde commented on CALCITE-2301:
--

Would it be possible to set the deadline in ResultSetEnumerable on 
construction, so that there is less mutable state? After all, we know the 
timeout and we know the query execution start time when the enumerable is 
constructed.
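
A minimal sketch of that idea, assuming a hypothetical QueryDeadline helper rather than the actual ResultSetEnumerable code: the deadline is computed once in the constructor from the execution start time and the timeout, and kept in a final field, so there is no mutable state to update later. The value returned is in seconds, the unit that JDBC's Statement.setQueryTimeout(int) expects; treating 0 as "no limit" follows that method's contract.

```java
/** Hypothetical helper; illustrates a deadline fixed at construction time. */
public final class QueryDeadline {
  /** Absolute deadline in epoch millis; 0 means "no timeout". */
  private final long deadlineMillis;

  public QueryDeadline(long queryStartMillis, long timeoutMillis) {
    this.deadlineMillis = timeoutMillis > 0 ? queryStartMillis + timeoutMillis : 0;
  }

  /**
   * Returns the remaining time in whole seconds, rounded up, or 0 if no
   * timeout was requested.
   *
   * @throws IllegalStateException if the deadline has already passed
   */
  public int remainingSeconds(long nowMillis) {
    if (deadlineMillis == 0) {
      return 0;
    }
    long remaining = deadlineMillis - nowMillis;
    if (remaining <= 0) {
      throw new IllegalStateException("Query timeout exceeded");
    }
    // Round up so that a sub-second remainder does not turn into 0 ("no limit").
    return (int) Math.min(Integer.MAX_VALUE, (remaining + 999) / 1000);
  }

  public static void main(String[] args) {
    QueryDeadline deadline = new QueryDeadline(1_000L, 10_000L);
    System.out.println(deadline.remainingSeconds(1_500L));
  }
}
```

A caller could pass the result to Statement.setQueryTimeout just before executing the JDBC statement; how the start time and timeout reach the enumerable is left open here.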



[jira] [Commented] (CALCITE-2635) getMonotonocity is slow on wide tables

2019-01-09 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738592#comment-16738592
 ] 

Julian Hyde commented on CALCITE-2635:
--

Sure, we can merge this with no extra tests.

But I would be grateful for help getting to a performance testing framework.



[jira] [Commented] (CALCITE-2635) getMonotonocity is slow on wide tables

2019-01-09 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738586#comment-16738586
 ] 

Julian Hyde commented on CALCITE-2635:
--

I agree, https://arewefastyet.com would be awesome. But we've been talking for 
years about performance regression tests and no one has done anything. Whatever 
approach we take, we will want to annotate in the code which tests are 
considered performance-critical. Then we can gather performance of those tests 
over time, on the same hardware, and look for trends/variance.

The variance of a single-threaded test between the slowest and fastest hardware 
is no more than 5x, whereas the variance between a good and bad algorithm can 
be several orders of magnitude. So it's not too difficult to write a useful 
test.
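
One way the "annotate in the code" step could look, as a sketch only: the @PerformanceCritical annotation and the PerfMarkerDemo class below are hypothetical names, not an existing Calcite or JUnit API. A runtime-retained marker lets a harness discover the relevant tests by reflection and then time them on fixed hardware.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical demo of marking and discovering performance-critical tests. */
public class PerfMarkerDemo {
  /** Marker for tests whose timings should be tracked over time. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface PerformanceCritical {}

  /** Stand-in for a real test class. */
  static class SampleTests {
    @PerformanceCritical public void planWideTable() {}
    public void ordinaryTest() {}
  }

  /** Returns the names of the marked methods, sorted for stable output. */
  static List<String> performanceCritical(Class<?> testClass) {
    List<String> names = new ArrayList<>();
    for (Method method : testClass.getDeclaredMethods()) {
      if (method.isAnnotationPresent(PerformanceCritical.class)) {
        names.add(method.getName());
      }
    }
    Collections.sort(names);
    return names;
  }

  public static void main(String[] args) {
    System.out.println(performanceCritical(SampleTests.class));
  }
}
```

The marker deliberately carries no expected duration, so it does not depend on the hardware the test runs on.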



[jira] [Commented] (CALCITE-2635) getMonotonocity is slow on wide tables

2019-01-09 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738551#comment-16738551
 ] 

Vladimir Sitnikov commented on CALCITE-2635:


{quote}@PerformanceTest(expectedDuration = "2s", variance = "5%"){quote}

Expected duration depends on the hardware. For instance, a notebook, virtual 
machine, desktop, VPS, etc. could all have very different raw performance.

I think it is much better to invest time in building something like 
https://arewefastyet.com
In other words, we could have a set of "standard" benchmarks + a consistent 
machine for execution + scheduled executions so we can track regressions.

I'm inclined to merge this fix with no extra tests.


Note: the change is a clear win.
An alternative option is to implement a HashMap to speed up 
{{org.apache.calcite.rel.type.RelDataType#getField(String fieldName, boolean 
caseSensitive, boolean elideRecord)}}. We do have 
{{org.apache.calcite.rel.type.RelDataTypeFactoryImpl#canonize(org.apache.calcite.rel.type.RelDataType)}},
 so a lazily initialized cache of field positions might help.
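
A rough sketch of such a lazily initialized cache, using a hypothetical FieldIndex class rather than Calcite's RelDataType implementation: the first lookup builds a HashMap from field name to position in O(N), and every later lookup is O(1) instead of a linear indexOf. It assumes case-sensitive names and single-threaded use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical field-position cache; not the actual Calcite code. */
public final class FieldIndex {
  private final List<String> fieldNames;
  /** Built on first use; not thread-safe in this sketch. */
  private Map<String, Integer> positions;

  public FieldIndex(List<String> fieldNames) {
    this.fieldNames = new ArrayList<>(fieldNames);
  }

  /** Returns the position of the first field with the given name, or -1. */
  public int indexOf(String name) {
    if (positions == null) {
      Map<String, Integer> map = new HashMap<>();
      for (int i = 0; i < fieldNames.size(); i++) {
        // putIfAbsent keeps the first position, matching List.indexOf.
        map.putIfAbsent(fieldNames.get(i), i);
      }
      positions = map;
    }
    Integer position = positions.get(name);
    return position == null ? -1 : position;
  }

  public static void main(String[] args) {
    List<String> names = new ArrayList<>();
    for (int i = 0; i < 18000; i++) {
      names.add("c" + i);
    }
    FieldIndex index = new FieldIndex(names);
    System.out.println(index.indexOf("c17999"));
  }
}
```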


However, we don't really expect a single table to have lots of collations, so we 
could just go with PR#891.
On top of that, we might add a hard limit like "try no more than the first 50 
collations of the table", so even a table with an extreme number of collations 
won't create a problem for {{getMonotonicity}}.



[jira] [Comment Edited] (CALCITE-2635) getMonotonocity is slow on wide tables

2019-01-09 Thread Vladimir Sitnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738551#comment-16738551
 ] 

Vladimir Sitnikov edited comment on CALCITE-2635 at 1/9/19 7:08 PM:


{quote}@PerformanceTest(expectedDuration = "2s", variance = "5%"){quote}

Expected duration depends on the hardware. For instance, a notebook, virtual 
machine, desktop, VPS, etc. could all have very different raw performance.

I think it is much better to invest time in building something like 
https://arewefastyet.com
In other words, we could have a set of "standard" benchmarks + a consistent 
machine for execution + scheduled executions so we can track regressions.

**I'm inclined to merge this fix with no extra tests.**


Note: the change is a clear win.
An alternative option is to implement a HashMap to speed up 
{{org.apache.calcite.rel.type.RelDataType#getField(String fieldName, boolean 
caseSensitive, boolean elideRecord)}}. We do have 
{{org.apache.calcite.rel.type.RelDataTypeFactoryImpl#canonize(org.apache.calcite.rel.type.RelDataType)}},
 so a lazily initialized cache of field positions might help.


However, we don't really expect a single table to have lots of collations, so we 
could just go with PR#891.
On top of that, we might add a hard limit like "try no more than the first 50 
collations of the table", so even a table with an extreme number of collations 
won't create a problem for {{getMonotonicity}}.




[jira] [Commented] (CALCITE-2757) DISTINCT not being handled correctly in RelToSqlConverter

2019-01-09 Thread KrishnaKant Agrawal (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738352#comment-16738352
 ] 

KrishnaKant Agrawal commented on CALCITE-2757:
--

Hi,

Thanks in advance for the patience required in reading this long comment. 

When I set the needNew flag as true in this case, something weird happens:- 
{code:java}
@Test public void testSelectWithDistinct() {
  String query = "select distinct \"product_id\","
      + " sum(\"product_class_id\") over (partition by \"product_id\") "
      + " from \"product\" ";
  final String expected = "SELECT \"product_id\", \"EXPR$1\"\n"
      + "FROM (SELECT \"product_id\", SUM(\"product_class_id\") OVER "
      + "(PARTITION BY \"product_id\" RANGE BETWEEN UNBOUNDED PRECEDING AND "
      + "UNBOUNDED FOLLOWING)\n"
      + "FROM \"foodmart\".\"product\") AS \"t\"\n"
      + "GROUP BY \"product_id\", \"EXPR$1\"";
  sql(query).withHive().ok(expected);
}
{code}
 
Actual (default window bounds removed here for visual clarity, although they 
are printed):-
{code:sql}
SELECT \"product_id\", SUM(\"product_class_id\") OVER (PARTITION BY \"product_id\" )
FROM (SELECT \"product_id\", SUM(\"product_class_id\") OVER (PARTITION BY \"product_id\" )
FROM \"foodmart\".\"product\") AS \"t\"
GROUP BY \"product_id\", SUM(\"product_class_id\") OVER (PARTITION BY \"product_id\" )
{code}
The EXPR$1 in the expected output is the correct thing, as the SUM() expression 
wasn't aliased in the original query.

When I set needNew to true, a subquery is formed. Printing the expressions of 
the inner query at the parent level, where the input columns needed to 
construct those expressions may no longer be available, seems like a bug.

 

This is not a problem if the SUM() expression was aliased in the original query.

 

This happens in the AliasContext.field() method, where we store expressions in 
SqlImplementor.ordinalMap and return them as-is from AliasContext.field().

Can somebody shed light on this? 

 

> DISTINCT not being handled correctly in RelToSqlConverter
> -
>
> Key: CALCITE-2757
> URL: https://issues.apache.org/jira/browse/CALCITE-2757
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: KrishnaKant Agrawal
>Assignee: Julian Hyde
>Priority: Major
>
>   SELECT DISTINCT sum( x ) OVER (PARTITION BY y) FROM t
> is valid (per SQL standard) but
>   SELECT sum( x ) OVER (PARTITION BY y)
>   FROM t
>   GROUP BY sum( x ) OVER (PARTITION BY y)
> is not. For example, given the query
>   select sum(deptno) over (partition by loc)
>   from dept
>   group by  sum(deptno) over (partition by loc);
> Oracle gives
>   ORA-00934: group function is not allowed here
> Therefore we should generate a sub-query, something like this:
>   SELECT c1
>   FROM (
>     SELECT sum(deptno) OVER (PARTITION BY loc)
>     FROM dept) AS t
>   GROUP BY c1;
>  
> This will be achieved by adding a new condition that sets the needNew flag 
> in SqlImplementor.Builder.builder() to true when aggregate expressions are 
> passed as GROUP BY keys.





[jira] [Commented] (CALCITE-2741) Add operator table with Hive-specific built-in functions

2019-01-09 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738337#comment-16738337
 ] 

Stamatis Zampetakis commented on CALCITE-2741:
--

Hi [~hhlai1990],

I have the same template. Unfortunately, it does not cover everything requested 
by checkstyle, so there are still things that you have to take care of manually. 

However, the errors of type "is not preceded with whitespace" suggest 
that you have not configured IntelliJ correctly to use that 
template. Consider, for instance, the error "'=' is not preceded with 
whitespace". Normally, if you go to your IntelliJ settings under 
File>Settings>Editor>Code Style>Java and select the "Spaces" tab, you should see 
that the "Assignment operators" checkbox is selected. 

> Add operator table with Hive-specific built-in functions
> 
>
> Key: CALCITE-2741
> URL: https://issues.apache.org/jira/browse/CALCITE-2741
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Lai Zhou
>Assignee: Julian Hyde
>Priority: Minor
> Attachments: 屏幕快照 2019-01-09 16.49.34.png
>
>
> [~julianhyde],
> I extended the native enumerable implementation of Calcite to support Hive SQL, 
> including UDFs, UDAFs and all the SqlSpecialOperators, inspired by Apache 
> Drill.
> I modified the parser and type system, and bridged the Hive operators.
> What do you think of supporting a direct implementation of Hive SQL like this?
> I think it will be valuable when someone wants to migrate their Hive ETL jobs 
> to a real-time scenario.





[jira] [Closed] (CALCITE-2780) Replace UnmodifiableArrayList with JDK Functions

2019-01-09 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR closed CALCITE-2780.

Resolution: Won't Fix

Sorry for the confusion, the comment is not clear what it means to be "quick."  
Thank you for the clarification.  I'm impressed that you're able to put 
yourself back into your own shoes 5 years ago.

It seems a bit overkill to write your own classes for something so trivial, but 
I was just making a suggestion.  Will close now.

Take care.

> Replace UnmodifiableArrayList with JDK Functions
> 
>
> Key: CALCITE-2780
> URL: https://issues.apache.org/jira/browse/CALCITE-2780
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: BELUGA BEHR
>Assignee: Julian Hyde
>Priority: Minor
>






[jira] [Commented] (CALCITE-2780) Replace UnmodifiableArrayList with JDK Functions

2019-01-09 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738314#comment-16738314
 ] 

Julian Hyde commented on CALCITE-2780:
--

I wrote the comment. (Check the history of that code.)  I know what I meant. 



[jira] [Commented] (CALCITE-2780) Replace UnmodifiableArrayList with JDK Functions

2019-01-09 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738291#comment-16738291
 ] 

BELUGA BEHR commented on CALCITE-2780:
--

[~julianhyde] I believe the word "quick" refers not to the creation of the 
wrapper itself but to the fact that the data structure returned does not make a 
copy of the underlying array.  I have updated the comment of the 
Util#unmodifiableList to reflect this and updated the PR.  Thanks.
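
The "no copy" behavior can be shown with JDK classes alone. The sketch below is not Calcite's UnmodifiableArrayList or Util#unmodifiableList; the UnmodifiableViewDemo class and its method are hypothetical and rely only on Arrays.asList and Collections.unmodifiableList, both of which wrap their argument instead of copying it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical demo: a read-only list view over an array, without copying. */
public class UnmodifiableViewDemo {
  /** Wraps the array; the returned list shares the array's storage. */
  static <T> List<T> unmodifiableView(T[] array) {
    return Collections.unmodifiableList(Arrays.asList(array));
  }

  public static void main(String[] args) {
    String[] names = {"x", "y"};
    List<String> view = unmodifiableView(names);

    // Because nothing was copied, a later write to the array shows through.
    names[0] = "z";
    System.out.println(view.get(0));

    // Writes through the view itself are rejected.
    try {
      view.set(0, "w");
    } catch (UnsupportedOperationException e) {
      System.out.println("read-only");
    }
  }
}
```

Wrapping is cheap regardless of array length, but the caller must not modify the array afterwards if the list is meant to be effectively immutable.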



[jira] [Updated] (CALCITE-2777) Inclusion of Period data type in calcite.

2019-01-09 Thread Garvit (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Garvit updated CALCITE-2777:

Description: 
Currently SqlTypeName does not support the period data type in Calcite. 
The following data types need to be included for period data type support :-

*

PERIOD(DATE)
PERIOD(TIMESTAMP(n) WITH TIME ZONE)
PERIOD(TIMESTAMP(n))
PERIOD(TIME(n))
PERIOD(TIME(n) WITH TIME ZONE)

*

These data types exist in Teradata (TDv13 onwards).

Common use of these data types :-

CREATE MULTISET TABLE person_coaching_period (
  person_id    INTEGER  NOT NULL,
  coaching_program CHAR(2)  NOT NULL,
  enrolled_period  PERIOD(DATE) FORMAT '-MM-DD' NULL
)
PRIMARY INDEX (person_id);

INSERT INTO person_coaching_period
VALUES (
  1001,
  'SC',
  PERIOD(
    DATE '2010-03-01',
    DATE '2010-08-01'
  )
);
 

Usage in the query :-

SELECT person_id
FROM person_coaching_period
WHERE enrolled_period OVERLAPS
PERIOD(DATE '2006-05-01', DATE '2007-09-24');

Other operators used :-

LDIFF, OVERLAPS, BEGIN, END, EXPAND ON, By ANCHOR, NEXT, PRIOR, INTERVAL

  was:
Currently SqlTypeName does not support the period data type in Calcite. 
The following data types need to be included for period data type support :-

PERIOD(DATE)
PERIOD(TIMESTAMP(n) WITH TIME ZONE)
PERIOD(TIMESTAMP(n))
PERIOD(TIME(n))
PERIOD(TIME(n) WITH TIME ZONE)

These data types exist in Teradata (TDv13 onwards). 

Common use of these data types :- 

CREATE MULTISET TABLE person_coaching_period (
  person_id    INTEGER  NOT NULL,
  coaching_program CHAR(2)  NOT NULL,
  enrolled_period  PERIOD(DATE) FORMAT '-MM-DD' NULL
)
PRIMARY INDEX (person_id);

INSERT INTO person_coaching_period
VALUES (
  1001,
  'SC',
  PERIOD(
    DATE '2010-03-01',
    DATE '2010-08-01'
  )
);
 

Usage in the query :-

SELECT person_id
FROM person_coaching_period
WHERE enrolled_period OVERLAPS
PERIOD(DATE '2006-05-01', DATE '2007-09-24');

Other operators used :-

LDIFF, OVERLAPS, BEGIN, END, EXPAND ON, By ANCHOR, NEXT, PRIOR, INTERVAL


> Inclusion of Period data type in calcite.
> -
>
> Key: CALCITE-2777
> URL: https://issues.apache.org/jira/browse/CALCITE-2777
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.18.0
>Reporter: Garvit
>Assignee: Julian Hyde
>Priority: Major
>
> Currently SqlTypeName does not support the period data type in Calcite. 
> The following data types need to be included for period data type support :-
> *
> PERIOD(DATE)
> PERIOD(TIMESTAMP(n) WITH TIME ZONE)
> PERIOD(TIMESTAMP(n))
> PERIOD(TIME(n))
> PERIOD(TIME(n) WITH TIME ZONE)
> *
> These data types exist in Teradata (TDv13 onwards).
> Common use of these data types :-
> CREATE MULTISET TABLE person_coaching_period (
>   person_id    INTEGER  NOT NULL,
>   coaching_program CHAR(2)  NOT NULL,
>   enrolled_period  PERIOD(DATE) FORMAT '-MM-DD' NULL
> )
> PRIMARY INDEX (person_id);
> INSERT INTO person_coaching_period
> VALUES (
>   1001,
>   'SC',
>   PERIOD(
>     DATE '2010-03-01',
>     DATE '2010-08-01'
>   )
> );
>  
> Usage in the query :-
> SELECT person_id
> FROM person_coaching_period
> WHERE enrolled_period OVERLAPS
> PERIOD(DATE '2006-05-01', DATE '2007-09-24');
> Other operators used :-
> LDIFF, OVERLAPS, BEGIN, END, EXPAND ON, By ANCHOR, NEXT, PRIOR, INTERVAL





[jira] [Updated] (CALCITE-2777) Inclusion of Period data type in calcite.

2019-01-09 Thread Garvit (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Garvit updated CALCITE-2777:

Description: 
Currently SqlTypeName does not support the period data type in Calcite. 
The following data types need to be included for period data type support :-

PERIOD(DATE)
PERIOD(TIMESTAMP WITH TIME ZONE)
PERIOD(TIMESTAMP)
PERIOD(TIME)
PERIOD(TIME WITH TIME ZONE)

These data types exist in Teradata (TDv13 onwards).

Common use of these data types :-

CREATE MULTISET TABLE person_coaching_period (
  person_id    INTEGER  NOT NULL,
  coaching_program CHAR(2)  NOT NULL,
  enrolled_period  PERIOD(DATE) FORMAT '-MM-DD' NULL
)
PRIMARY INDEX (person_id);

INSERT INTO person_coaching_period
VALUES (
  1001,
  'SC',
  PERIOD(
    DATE '2010-03-01',
    DATE '2010-08-01'
  )
);
 

Usage in a query:

SELECT person_id
FROM person_coaching_period
WHERE enrolled_period OVERLAPS
PERIOD(DATE '2006-05-01', DATE '2007-09-24');

Other operators used:

LDIFF, OVERLAPS, BEGIN, END, EXPAND ON, BY ANCHOR, NEXT, PRIOR, INTERVAL
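
To make the intended OVERLAPS behaviour concrete, here is a minimal Java sketch of a date period. It is only an illustration: the class name DatePeriod is hypothetical and is not part of Calcite or Teradata. It also assumes that a period includes its beginning bound and excludes its ending bound; that assumption should be checked against the Teradata documentation before relying on it.

```java
import java.time.LocalDate;

/** Hypothetical value class for illustration; not a Calcite or Teradata API. */
final class DatePeriod {
  private final LocalDate begin; // assumed inclusive
  private final LocalDate end;   // assumed exclusive

  DatePeriod(LocalDate begin, LocalDate end) {
    // Under the half-open assumption an empty or reversed period is rejected.
    if (!begin.isBefore(end)) {
      throw new IllegalArgumentException("begin must be before end");
    }
    this.begin = begin;
    this.end = end;
  }

  /** Returns true if the two half-open periods share at least one day. */
  boolean overlaps(DatePeriod other) {
    return begin.isBefore(other.end) && other.begin.isBefore(end);
  }
}
```

Under these assumptions, the row inserted above (2010-03-01 to 2010-08-01) does not overlap the period used in the query (2006-05-01 to 2007-09-24), so that query would not return person 1001.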

  was:
Currently, SqlTypeName does not support the PERIOD data type in Calcite. 
The following data types need to be added for PERIOD support:

*

PERIOD(DATE)
PERIOD(TIMESTAMP(n) WITH TIME ZONE)
PERIOD(TIMESTAMP(n))
PERIOD(TIME(n))
PERIOD(TIME(n) WITH TIME ZONE)

*

These data types exist in Teradata (TDv13 onwards).

Common use of these data types:

CREATE MULTISET TABLE person_coaching_period (
  person_id    INTEGER  NOT NULL,
  coaching_program CHAR(2)  NOT NULL,
  enrolled_period  PERIOD(DATE) FORMAT 'YYYY-MM-DD' NULL
)
PRIMARY INDEX (person_id);

INSERT INTO person_coaching_period
VALUES (
  1001,
  'SC',
  PERIOD(
    DATE '2010-03-01',
    DATE '2010-08-01'
  )
);
 

Usage in a query:

SELECT person_id
FROM person_coaching_period
WHERE enrolled_period OVERLAPS
PERIOD(DATE '2006-05-01', DATE '2007-09-24');

Other operators used:

LDIFF, OVERLAPS, BEGIN, END, EXPAND ON, BY ANCHOR, NEXT, PRIOR, INTERVAL


> Inclusion of Period data type in calcite.
> -
>
> Key: CALCITE-2777
> URL: https://issues.apache.org/jira/browse/CALCITE-2777
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.18.0
>Reporter: Garvit
>Assignee: Julian Hyde
>Priority: Major
>
> Currently, SqlTypeName does not support the PERIOD data type in Calcite. 
> The following data types need to be added for PERIOD support:
> PERIOD(DATE)
> PERIOD(TIMESTAMP WITH TIME ZONE)
> PERIOD(TIMESTAMP)
> PERIOD(TIME)
> PERIOD(TIME WITH TIME ZONE)
> These data types exist in Teradata (TDv13 onwards).
> Common use of these data types:
> CREATE MULTISET TABLE person_coaching_period (
>   person_id    INTEGER  NOT NULL,
>   coaching_program CHAR(2)  NOT NULL,
>   enrolled_period  PERIOD(DATE) FORMAT 'YYYY-MM-DD' NULL
> )
> PRIMARY INDEX (person_id);
> INSERT INTO person_coaching_period
> VALUES (
>   1001,
>   'SC',
>   PERIOD(
>     DATE '2010-03-01',
>     DATE '2010-08-01'
>   )
> );
>  
> Usage in a query:
> SELECT person_id
> FROM person_coaching_period
> WHERE enrolled_period OVERLAPS
> PERIOD(DATE '2006-05-01', DATE '2007-09-24');
> Other operators used:
> LDIFF, OVERLAPS, BEGIN, END, EXPAND ON, BY ANCHOR, NEXT, PRIOR, INTERVAL





[jira] [Updated] (CALCITE-2741) Add operator table with Hive-specific built-in functions

2019-01-09 Thread Lai Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lai Zhou updated CALCITE-2741:
--
Attachment: 屏幕快照 2019-01-09 16.49.34.png

> Add operator table with Hive-specific built-in functions
> 
>
> Key: CALCITE-2741
> URL: https://issues.apache.org/jira/browse/CALCITE-2741
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Lai Zhou
>Assignee: Julian Hyde
>Priority: Minor
> Attachments: 屏幕快照 2019-01-09 16.49.34.png
>
>
> [~julianhyde],
> I extended Calcite's native Enumerable implementation to support Hive SQL, 
> including UDFs, UDAFs, and all the SqlSpecialOperators, inspired by Apache 
> Drill.
> I modified the parser and the type system, and bridged the Hive operators.
> What do you think of supporting a direct implementation of Hive SQL like this?
> I think it would be valuable when someone wants to migrate their Hive ETL jobs 
> to a real-time scenario.





[jira] [Commented] (CALCITE-2741) Add operator table with Hive-specific built-in functions

2019-01-09 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738061#comment-16738061
 ] 

Lai Zhou commented on CALCITE-2741:
---

Hi [~zabetak], could you please point me to the correct IntelliJ code-style template?

 

I already have one from 
[https://gist.github.com/gianm/27a4e3cad99d7b9b6513b6885d3cfcc9],

but there are still a lot of errors when running maven-checkstyle:

!屏幕快照 2019-01-09 16.49.34.png!

 

> Add operator table with Hive-specific built-in functions
> 
>
> Key: CALCITE-2741
> URL: https://issues.apache.org/jira/browse/CALCITE-2741
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Lai Zhou
>Assignee: Julian Hyde
>Priority: Minor
> Attachments: 屏幕快照 2019-01-09 16.49.34.png
>
>
> [~julianhyde],
> I extended Calcite's native Enumerable implementation to support Hive SQL, 
> including UDFs, UDAFs, and all the SqlSpecialOperators, inspired by Apache 
> Drill.
> I modified the parser and the type system, and bridged the Hive operators.
> What do you think of supporting a direct implementation of Hive SQL like this?
> I think it would be valuable when someone wants to migrate their Hive ETL jobs 
> to a real-time scenario.





[jira] [Comment Edited] (CALCITE-2741) Add operator table with Hive-specific built-in functions

2019-01-09 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16738061#comment-16738061
 ] 

Lai Zhou edited comment on CALCITE-2741 at 1/9/19 10:10 AM:


Hi [~zabetak], could you please point me to the correct IntelliJ code-style template?

I already have one from 
[https://gist.github.com/gianm/27a4e3cad99d7b9b6513b6885d3cfcc9],

but there are still a lot of errors when running maven-checkstyle:
{code:java}
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/util/Util.java:2105:13: ';' is followed by whitespace. [EmptyForIteratorPad]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/util/Util.java:2110:13: ';' is preceded with whitespace. [NoWhitespaceBefore]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/util/Util.java:2110:15: ';' is followed by whitespace. [EmptyForIteratorPad]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/util/Util.java:2361: 'toImmutableList' have incorrect indentation level 2, expected level should be 6. [Indentation]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:121: Line is longer than 100 characters (found 105). [LineLength]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:121:31: '=' is not preceded with whitespace. [WhitespaceAround]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:121:32: '=' is not followed by whitespace. [WhitespaceAround]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:122:30: '=' is not preceded with whitespace. [WhitespaceAround]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:122:31: '=' is not followed by whitespace. [WhitespaceAround]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:123:24: '=' is not preceded with whitespace. [WhitespaceAround]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:126:8: 'catch' is not preceded with whitespace. [WhitespaceAround]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:126:8: '}' is not followed by whitespace. [WhitespaceAround]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:126:28: '{' is not preceded with whitespace. [WhitespaceAround]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:127: Line is longer than 100 characters (found 132). [LineLength]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:127:127: ',' is preceded with whitespace. [NoWhitespaceBefore]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/config/CalciteConnectionConfigImpl.java:127:129: ',' is not followed by whitespace. [WhitespaceAfter]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java:212:13: ';' is preceded with whitespace. [NoWhitespaceBefore]
[ERROR] /Users/zhoulai/Downloads/big-data/calcite/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java:212:15: ';' is followed by whitespace. [EmptyForIteratorPad]
[
{code}
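
For reference, the checks named in this output mostly concern whitespace. The snippet below is a small sketch of the layout those checks appear to expect, based on the messages above. The class and method names are made up for illustration and are not taken from the Calcite sources.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical examples of code laid out to satisfy the reported checks. */
final class CheckstyleExamples {
  private CheckstyleExamples() {}

  /** Returns a copy of the input without negative values. */
  static List<Integer> dropNegatives(List<Integer> input) {
    List<Integer> result = new ArrayList<>(input);
    // EmptyForIteratorPad: no space after the last ';' when the iterator is empty.
    // NoWhitespaceBefore: no space before ';'.
    for (Iterator<Integer> it = result.iterator(); it.hasNext();) {
      if (it.next() < 0) {
        it.remove();
      }
    }
    return result;
  }

  /** Parses an int, falling back to a default on malformed input. */
  static int parseOrDefault(String text, int defaultValue) {
    // WhitespaceAround: one space on each side of '='.
    int value = defaultValue;
    try {
      value = Integer.parseInt(text.trim());
    } catch (NumberFormatException e) { // spaces around 'catch' and before '{'
      value = defaultValue;
    }
    return value;
  }
}
```

The LineLength and Indentation messages are not shown here; going by the output above, lines are limited to 100 characters and continuation lines are expected at a deeper indentation level than the IDE produced.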
 


was (Author: hhlai1990):
Hi [~zabetak], could you please point me to the correct IntelliJ code-style template?

 

I already have one from 
[https://gist.github.com/gianm/27a4e3cad99d7b9b6513b6885d3cfcc9],

but there are still a lot of errors when running maven-checkstyle:

!屏幕快照 2019-01-09 16.49.34.png!

 

> Add operator table with Hive-specific built-in functions
> 
>
> Key: CALCITE-2741
> URL: https://issues.apache.org/jira/browse/CALCITE-2741
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Lai Zhou
>Assignee: Julian Hyde
>Priority: Minor
> Attachments: 屏幕快照 2019-01-09 16.49.34.png
>
>
> [~julianhyde],
> I extended Calcite's native Enumerable implementation to support Hive SQL, 
> including UDFs, UDAFs, and all the SqlSpecialOperators, inspired by Apache 
> Drill.
> I modified the parser and the type system, and bridged the Hive operators.
> How do you think of