Re: [DISCUSS] Make Enumerable operators responsive to interrupts

2019-10-17 Thread Danny Chan
I share this concern. Full interrupt support is great for cancelling a hung 
job, but it is really hard to get completely right, and it can be confusing for 
code that should not have to care about interrupt signals.
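
For concreteness, here is a minimal sketch of the cooperative style being discussed: a row loop that polls the thread's interrupt flag between rows. The class and method names are hypothetical and are not Calcite API; the choice of CancellationException is also just an assumption for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CancellationException;

// Hypothetical helper, not Calcite API.
final class InterruptAwareDrain {
  private InterruptAwareDrain() {}

  /** Copies rows into a list, checking the interrupt flag before each row. */
  static <T> List<T> drain(Iterator<T> rows) {
    List<T> out = new ArrayList<>();
    while (rows.hasNext()) {
      // isInterrupted() only reads the flag, so callers further up the
      // stack can still observe the interrupt after we bail out.
      if (Thread.currentThread().isInterrupted()) {
        throw new CancellationException(
            "interrupted after " + out.size() + " rows");
      }
      out.add(rows.next());
    }
    return out;
  }
}
```

Even in this tiny case the hard part is visible: every loop has to remember the check, and code that swallows or clears the flag elsewhere silently defeats it.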

Best,
Danny Chan
On Oct 17, 2019 at 5:47 PM +0800, Vladimir Sitnikov wrote:
> Roman Elizarov raises valid points re 'interrupts are too hard (or even
> impossible) to get right':
> https://twitter.com/relizarov/status/1184460504238100480
>
> Vladimir


Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread Danny Chan
That is the point: we should supply a convenient way to extend RelMetadataQuery 
in Calcite, because in most RelOptRules users write code like:

RelOptRuleCall.getMetadataQuery

to get a RelMetadataQuery (RMQ), instead of using AbstractRelNode.metadata() to 
go through a MetadataFactory.

We should at least unify the metadata query entry points/interfaces; otherwise 
it is confusing.

Best,
Danny Chan
On Oct 18, 2019 at 12:23 AM +0800, Seliverstov Igor wrote:
> At least in our project (Apache Ignite) we use AbstractRelNode.metadata().
>
> But that is so because there is no way to put our metadata type into
> RelMetadataQuery without changes in Calcite.
>
> Regards,
> Igor
>
> Thu, Oct 17, 2019, 19:16 Xiening Dai:
>
> > MetadataFactory is still useful. It provides a way to access Metadata
> > directly. If someone creates a new type of Metadata class, it can be
> > accessed through AbstractRelNode.metadata(). This way you don’t need to
> > update RelMetadataQuery interface to include the getter for this new meta.
> > Although I don’t see this pattern being used often, but I do think it is
> > still useful and shouldn’t be removed.
> >
> >
> > For your second point, I think you would still need a way to keep
> > RelMetadataQuery object during a rule call. If you choose to create new
> > instance, you will have to pass it around while applying the rule. That
> > actually complicates things a lot.
> >
> >
> > > On Oct 17, 2019, at 12:49 AM, XING JIN  wrote:
> > >
> > > 1. RelMetadataQuery covers the functionality of MetadataFactory, why
> > should
> > > we keep/maintain both of them ? shall we just deprecate MetadataFactory.
> > I
> > > see MetadataFactory is rarely used in current code. Also I
> > > think MetadataFactory is not good place to offering customized metadata,
> > > which will make user confused for the difference between RelMetadataQuery
> > > and MetadataFactory.
> > >
> > > > Customized RelMetadataQuery with code generated meta handler for
> > > customized metadata, also can provide convenient way to get metadata.
> > > It makes sense for me.
> > >
> > > 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall
> > we
> > > deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
> > > metadata but without a RelOptCall, he/she will need to create a new
> > > instance of RelMetadataQuery.
> > >
> > > On Thu, Oct 17, 2019 at 2:27 AM, Xiening Dai wrote:
> > >
> > > > I have seen both patterns in current code base. In most places, for
> > > > example SubQueryRemoveRule, AggregateUnionTrasposeRule
> > > > SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used.
> > And
> > > > there are a few other places where new RelMetadataQuery instance is
> > > > created, which Haisheng attempts to fix.
> > > >
> > > > Currently RelOptCluster.invalidateMetadataQuery() is called at the end
> > of
> > > > RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
> > > > guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
> > > >
> > > >
> > > > > On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
> > > > >
> > > > > This is the reason I was struggling for the discussion.
> > > > >
> > > > > Best,
> > > > > Danny Chan
> > > > > On Oct 16, 2019 at 11:23 AM +0800, dev@calcite.apache.org wrote:
> > > > > >
> > > > > > RelMetadataQuery
> > > >
> > > >
> >
> >


[DISCUSS] On-demand traitset request

2019-10-17 Thread Haisheng Yuan
TL;DR
Both top-down physical TraitSet requests and bottom-up TraitSet
derivation have their strengths and weaknesses. We propose an
on-demand TraitSet request that combines the two, to reduce
the number of plan alternatives that are generated, especially
in distributed systems.

e.g.
select * from foo join bar on f1=b1 and f2=b2 and f3=b3;

In a non-distributed system, we can generate a sort merge join
that requests foo sorted by f1,f2,f3 and bar sorted by b1,b2,b3.
But if foo happens to be sorted by f3,f2,f1, we may miss the
chance to use the ordering that foo already delivers: if we instead
require bar to be sorted by b3,b2,b1, we no longer need to
sort foo. There are many choices (n! orderings of n join keys), not
even counting asc/desc and null direction. We can't request all
the possible traitsets top-down, and we can't derive all the
possible traitsets bottom-up either.

We propose on-demand traitset request by adding a new type
of metadata DerivedTraitSets into the built-in metadata system.

List deriveTraitSets(RelNode, RelMetadataQuery)

In this metadata, every operator returns several possible traitsets
that may be derived from this operator.

Using the above query as an example, the table scan on foo should
return a traitset with collation on f3, f2, f1.

A physical implementation rule, e.g. the SortMergeJoinRule,
gets the possible traitsets from both child operators, uses the join
keys to eliminate useless traitsets, keeps the useful ones,
and requests the corresponding traitset on the other child.
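
To make the matching step concrete, here is a toy sketch in plain Java. It is not Calcite API: collations are modelled as lists of column names, the class and method names are hypothetical, and it assumes the strict case where a delivered collation is only useful if it covers exactly the join keys.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Toy model, not Calcite API: a collation is a list of column names.
final class OnDemandCollation {
  private OnDemandCollation() {}

  /**
   * Given the equi-join key pairs (left column to right column) and a
   * collation delivered by the left input, returns the collation to
   * request from the right input, or empty if the delivered collation
   * does not cover exactly the join keys.
   */
  static Optional<List<String>> requestForRight(
      Map<String, String> joinKeys, List<String> deliveredLeft) {
    if (deliveredLeft.size() != joinKeys.size()) {
      return Optional.empty();
    }
    Set<String> seen = new HashSet<>();
    List<String> request = new ArrayList<>();
    for (String leftColumn : deliveredLeft) {
      String rightColumn = joinKeys.get(leftColumn);
      // Reject columns that are not join keys, and repeated columns.
      if (rightColumn == null || !seen.add(leftColumn)) {
        return Optional.empty();
      }
      request.add(rightColumn);
    }
    return Optional.of(request);
  }
}
```

With join keys f1=b1, f2=b2, f3=b3 and foo delivering f3,f2,f1, the sketch asks bar for b3,b2,b1, so only one of the n! orderings is requested.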

This relies on the feature of AbstractConverter, which is turned
off by default, due to performance issue [1].

Thoughts?

[1] https://issues.apache.org/jira/browse/CALCITE-2970

Haisheng



[jira] [Created] (CALCITE-3427) some subquery correlated case isn't fully implemented

2019-10-17 Thread liuzonghao (Jira)
liuzonghao created CALCITE-3427:
---

 Summary: some subquery correlated case isn't fully implemented
 Key: CALCITE-3427
 URL: https://issues.apache.org/jira/browse/CALCITE-3427
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.21.0
Reporter: liuzonghao
 Fix For: next


{code:java}
// code placeholder
// for correlated case queries such as
//
// select e.deptno, e.deptno < some (
//   select deptno from emp where emp.name = e.name) as v
// from emp as e
//
// becomes
//
// select e.deptno,
//   case
//   when indicator is null then false // sub-query is empty for corresponding corr value
//   when q.c = 0 then false // sub-query is empty
//   when (e.deptno < q.m) is true then true
//   when q.c > q.d then unknown // sub-query has at least one null
//   else e.deptno < q.m
//   end as v
// from emp as e
// left outer join (
//   select name, max(deptno) as m, count(*) as c, count(deptno) as d,
//   "alwaysTrue" as indicator
//   from emp group by name) as q on e.name = q.name
Set varsUsed = RelOptUtil.getVariablesUsed(e.rel);
builder.push(e.rel)
.aggregate(builder.groupKey(),
builder.aggregateCall(minMax, builder.field(0)).as("m"),
builder.count(false, "c"),
builder.count(false, "d", builder.field(0)));

final List parentQueryFields = new ArrayList<>();
parentQueryFields.addAll(builder.fields());
String indicator = "trueLiteral";
parentQueryFields.add(builder.alias(literalTrue, indicator));
builder.project(parentQueryFields).as("q");
builder.join(JoinRelType.LEFT, literalTrue, variablesSet);
caseRexNode = builder.call(SqlStdOperatorTable.CASE,
builder.call(SqlStdOperatorTable.IS_NULL,
builder.field("q", indicator)),
literalFalse,
builder.call(SqlStdOperatorTable.EQUALS, builder.field("q", "c"),
builder.literal(0)),
literalFalse,
builder.call(SqlStdOperatorTable.IS_TRUE,
builder.call(RelOptUtil.op(op.comparisonKind, null),
e.operands.get(0), builder.field("q", "m"))),
literalTrue,
builder.call(SqlStdOperatorTable.GREATER_THAN,
builder.field("q", "c"), builder.field("q", "d")),
literalUnknown,
builder.call(RelOptUtil.op(op.comparisonKind, null),
e.operands.get(0), builder.field("q", "m")));
{code}
The implementation is in the SubQueryRemoveRule.rewriteSome method. The groupKey 
and the join condition are missing.
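
As a reference for the intended semantics, the CASE expression in the comment above can be sketched in plain Java for a single outer row. This is an illustration only, not Calcite code: the class name is hypothetical, the list stands for the sub-query rows that correlate with that outer row (what the missing group key and join condition would select), and a null Boolean stands for SQL UNKNOWN.

```java
import java.util.List;

// Illustrative only: evaluates "x < SOME (subQuery)" with SQL
// three-valued logic; a null result means UNKNOWN.
final class LessThanSome {
  private LessThanSome() {}

  static Boolean evaluate(Integer x, List<Integer> subQuery) {
    long c = subQuery.size(); // count(*)
    long d = 0;               // count(deptno): non-null values only
    Integer m = null;         // max(deptno), ignoring nulls
    for (Integer v : subQuery) {
      if (v != null) {
        d++;
        if (m == null || v > m) {
          m = v;
        }
      }
    }
    if (c == 0) {
      return Boolean.FALSE; // sub-query is empty for this outer row
    }
    Boolean cmp = (x == null || m == null) ? null : x < m;
    if (Boolean.TRUE.equals(cmp)) {
      return Boolean.TRUE;
    }
    if (c > d) {
      return null; // sub-query has at least one null
    }
    return cmp; // FALSE, or UNKNOWN when x itself is null
  }
}
```

The first branch covers both "indicator is null" and "q.c = 0" from the comment, since in this per-row model an unmatched outer row simply sees an empty list.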



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3426) compensate validConstant type in RexLiteral.

2019-10-17 Thread xzh_dz (Jira)
xzh_dz created CALCITE-3426:
---

 Summary: compensate validConstant type in RexLiteral.
 Key: CALCITE-3426
 URL: https://issues.apache.org/jira/browse/CALCITE-3426
 Project: Calcite
  Issue Type: Wish
Reporter: xzh_dz


compensate validConstant type in RexLiteral.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Apache Calcite meetup group

2019-10-17 Thread Julian Hyde
I would be delighted if you would do that - thank you!

If you are organizing a meet up, please consult with this list. There may be 
people who would be willing to speak and have something interesting to say.

Julian

> On Oct 16, 2019, at 3:14 PM, Jesus Camacho Rodriguez  
> wrote:
> 
> Hi Julian,
> 
> I have just seen your message. Although we have other ways to communicate,
> I believe it may be valuable to keep the group even if a meetup has not
> happened for a while (we may organize some meetups in the future, those
> interested in Calcite may be subscribed to group to attend talks around the
> project even if they do not follow the project closely through mailing
> list, etc.). I would be happy to pay the fees to keep it. I have just
> checked and I think I could simply go ahead and pay, but let me know if I
> need to do anything else.
> 
> Thanks,
> Jesús
> 
> 
> On Wed, Oct 16, 2019 at 11:21 AM Julian Hyde  wrote:
> 
>> If you’re a member of the Apache Calcite meetup group[1], you probably
>> just received an email saying that the group is shutting down. I set it up
>> a few years ago, but I never find time to organize meetups, so I decided to
>> stop paying the annual fee to meetup.com .
>> 
>> I’m not particularly sad that it’s closing down, given that it has been
>> inactive, and we as a community seem to find other ways to talk to each
>> other. But if someone in the community would like to organize some meetups
>> and is prepared to pay the fees, I’m happy to hand over the reins.
>> 
>> Julian
>> 
>> [1] https://www.meetup.com/Apache-Calcite/



[jira] [Created] (CALCITE-3425) Inconsistent behavior of MetadataProvider in RelOptCluster

2019-10-17 Thread Haisheng Yuan (Jira)
Haisheng Yuan created CALCITE-3425:
--

 Summary: Inconsistent behavior of MetadataProvider in RelOptCluster
 Key: CALCITE-3425
 URL: https://issues.apache.org/jira/browse/CALCITE-3425
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Haisheng Yuan


To use customized metadata provider, we can do the following:

{code:java}
RelMetadataQuery.THREAD_PROVIDERS.set(
  JaninoRelMetadataProvider.of(xxxmetadataProvider));
{code}

This only works for built-in metadata types. For customized metadata, we still 
get an exception when retrieving the metadata using reflection, because when the 
RelOptCluster is created it always uses the default metadata provider instead 
of the customized one.

{code:java}
setMetadataProvider(DefaultRelMetadataProvider.INSTANCE);
{code}

This is confusing: we have to set the provider in two places. Should we unify 
them in a single place?






Re: [DISCUSS] Support Sql Hint for Calcite

2019-10-17 Thread Julian Hyde
I wonder whether it is possible to add some kind of “action handler” to the 
planner engine, called, for example, when a rule has fired and is registering 
the RelNode created by the rule. People can write their own action handlers to 
copy hints around. Since the action handlers are the user’s code, they can 
iterate faster to find a hint-propagation strategy that works in practice.

Another idea is to use VolcanoPlanner.Provenance[1]. A RelNode can find its 
ancestor RelNodes, and the rules that fired to create it. So it can grab hints 
from those ancestors. It does not need to copy those hints onto itself.
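
A toy sketch of that second idea, under the assumption that each node keeps a link to the node it was derived from. The class below is hypothetical and does not use the VolcanoPlanner.Provenance API; it only shows hints being looked up through ancestors instead of being copied.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of provenance; not the VolcanoPlanner.Provenance API.
final class ProvNode {
  final String description;
  final List<String> hints; // hints attached directly to this node
  final ProvNode source;    // node this one was derived from, or null

  ProvNode(String description, List<String> hints, ProvNode source) {
    this.description = description;
    this.hints = hints;
    this.source = source;
  }

  /** Collects hints from this node and every ancestor, nearest first. */
  List<String> effectiveHints() {
    List<String> result = new ArrayList<>();
    for (ProvNode n = this; n != null; n = n.source) {
      result.addAll(n.hints);
    }
    return result;
  }
}
```

A node created by a rule carries no hints of its own here, yet effectiveHints() still finds the hints of the node it came from.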

Julian

[1] 
https://calcite.apache.org/apidocs/org/apache/calcite/plan/volcano/VolcanoPlanner.Provenance.html
 


> On Oct 16, 2019, at 8:38 PM, Haisheng Yuan  wrote:
> 
> Julian,
> Your concern is very valid, and that is also our main concern.
> I was thinking whether we can put hint into the MEMO group, so that both 
> logical and physical expression in the same group can share the same hint, 
> without copying the hint explicitly. But for newly generated expression that 
> doesn't belong to the original group, we still need to copy hints. What's 
> worse, in HepPlanner, there is no such concept, we may still need to copy 
> hints explicity in planner rules, if we want to keep the hint, which is 
> burdensome.
> 
> - Haisheng
> 
> --
> From: Danny Chan
> Date: 2019-10-16 14:54:46
> To:
> Subject: Re: [DISCUSS] Support Sql Hint for Calcite
> 
> Thanks for the clarification.
> 
> I understand your worry. Yes, the effort/memory would be wasted or 
> meaningless if hints are not used. That is just what a hint is: it is a 
> “hint” and non-mandatory, but we should give users the chance to see them; 
> it is the user who decides whether and how to use the hints. For big 
> queries I am not confident that we cover the corner cases. So can we mark this 
> feature as experimental and use it for simple queries (no decorrelation) first?
> 
> For “reversible”: during the implementation, I tried to keep the modifications 
> non-invasive to the current code. That is why I put all the hint-related 
> interfaces into one class named RelWithHint. Unlike traits, I 
> didn’t force users to pass the hints into the RelNode constructor. I think 
> it would not be a big job if we want to remove the API.
> 
> Best,
> Danny Chan
> On Oct 16, 2019 at 11:14 AM +0800, Julian Hyde wrote:
>> By “skeptical” I mean that I think we can come up with a mechanism to copy 
>> hints when applying planner rules, but even when we have implemented that 
>> mechanism there will be many cases where people want a hint and that hint is 
>> not copied to the RelNode where it is needed, and many other cases where we 
>> spend the effort/memory of copying the hint to a RelNode and the hint is not 
>> used.
>> 
>> By “reversible” I mean if we come up with an API that does not work, how do 
>> we change or remove that API without people complaining?
>> 
>> Julian
>> 
>> 
>>> On Oct 15, 2019, at 7:11 PM, Danny Chan  wrote:
>>> 
>>> Thanks Julian
>>> 
>>>> I am skeptical that RelWithHint will work for large queries.
>>> 
>>> For “skeptical” do you mean how to transfer the hints during rule planning 
>>> ? I’m also not that confident yet.
>>> 
>>>> How do we introduce it in a reversible way
>>> Do you mean transform the RelWithHint back into the SqlHint ? I didn’t 
>>> implement it in current patch, but I think we have the ability to do that 
>>> because we have a inheritPath for each RelWithHint, we can collect all the 
>>> hints together and merge them into the SqlHints, then propagate these 
>>> SqlHints to the SqlNodes.
>>> 
>>>> What are the other options?
>>> Do you mean the way to transfer hints during planning ? I have no other 
>>> options yet.
>>> 
>>> Best,
>>> Danny Chan
>>> On Oct 16, 2019 at 8:03 AM +0800, dev@calcite.apache.org wrote:
>>>>
>>>> I am skeptical that RelWithHint will work for large queries.
>> 
> 



Re: [DISCUSS] Make Avatica more discoverable

2019-10-17 Thread Julian Hyde
Many people who are interested in Avatica are not interested in Calcite. (Yes 
the Board are interested in both, because they are interested in the 
communities, which overlap. But users of Avatica, not so much. And if people 
perceive that Avatica requires Calcite they might be less likely to adopt it.)

I think Avatica would be more discoverable if it was hosted at 
https://avatica.apache.org , rather than 
https://calcite.apache.org/avatica/ .

I think the brief mention of Avatica on Calcite’s home page is adequate. 
Perhaps there should be a section on Avatica in 
https://calcite.apache.org/community/ .

In other words, it’s time to move Avatica a little further along its evolution 
from module to sub-project to independent project.

Julian


> On Oct 17, 2019, at 6:54 AM, Michael Mior  wrote:
> 
> Since there's only one sub-project, why don't we convert the
> Sub-Projects section on the homepage into a short description of
> Avatica?
> --
> Michael Mior
> mm...@apache.org
> 
> On Thu, Oct 17, 2019 at 06:19, Francis Chuang wrote:
>> 
>> This was one of the comments on the October Board report for Calcite:
>> 
>>   df: Great progress!
>> 
>>   About Avatica - I had to google to find the subproject as I did
>>   not see anything obvious on the Calcite site. I would also be
>>   good to provide more status on the subproject in your future
>>   reports.
>> 
>> The Avatica sub-project is currently linked on Calcite's homepage, but
>> it is not very noticeable or discoverable.
>> 
>> Any thoughts on how we can improve the visibility of Avatica?
>> 
>> Francis



Re: Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread Haisheng Yuan
MetadataFactory's implementation provides a unified, reflective approach to 
retrieving metadata, whether the metadata is a built-in or an extended type. 
RelMetadataQuery provides convenient methods (in codegen) to get built-in 
metadata, but not customized metadata. Unless we recommend sub-classing 
RelMetadataQuery as the way to add customized metadata, deprecating 
MetadataFactory is not a wise choice.
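
The trade-off can be illustrated with a toy model in plain Java. None of the classes below are Calcite API; ToyMetadataFactory stands in for the reflective entrance, ToyMetadataQuery for the typed one, and Skew is a made-up custom metadata kind.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy stand-ins; none of these classes are Calcite API.
class ToyNode {
  final double rows;
  ToyNode(double rows) { this.rows = rows; }
}

/** Made-up custom metadata kind. */
final class Skew {
  final double factor;
  Skew(double factor) { this.factor = factor; }
}

/** Reflective-style entrance: any metadata kind is looked up by its class. */
final class ToyMetadataFactory {
  private final Map<Class<?>, Function<ToyNode, ?>> handlers = new HashMap<>();

  <M> void register(Class<M> type, Function<ToyNode, M> handler) {
    handlers.put(type, handler);
  }

  <M> M query(ToyNode node, Class<M> type) {
    Function<ToyNode, ?> handler = handlers.get(type);
    if (handler == null) {
      throw new IllegalArgumentException("no handler for " + type.getName());
    }
    return type.cast(handler.apply(node));
  }
}

/** Typed entrance: one getter per built-in metadata kind. */
class ToyMetadataQuery {
  Double getRowCount(ToyNode node) { return node.rows; }
}

/** Sub-class that adds a typed getter for the custom kind. */
final class MyMetadataQuery extends ToyMetadataQuery {
  Skew getSkew(ToyNode node) { return new Skew(node.rows > 1000 ? 2.0 : 1.0); }
}
```

The factory accepts a new kind without any change to the query class, while the sub-class gives callers a typed getter but only helps if the planner hands out MyMetadataQuery instances.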

- Haisheng

--
From: XING JIN
Date: 2019-10-17 15:55:31
To:
Subject: Re: [DISCUSSION] Extension of Metadata Query

BTW, I think one JIRA number discussed in the thread would be
https://issues.apache.org/jira/browse/CALCITE-2855 not CALCITE-2885

Best,
Jin

On Thu, Oct 17, 2019 at 3:49 PM, XING JIN wrote:

> 1. RelMetadataQuery covers the functionality of MetadataFactory, why
> should we keep/maintain both of them ? shall we just
> deprecate MetadataFactory. I see MetadataFactory is rarely used in current
> code. Also I think MetadataFactory is not good place to offering customized
> metadata, which will make user confused for the difference
> between RelMetadataQuery and MetadataFactory.
>
> > Customized RelMetadataQuery with code generated meta handler for
> customized metadata, also can provide convenient way to get metadata.
> It makes sense for me.
>
> 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall we
> deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
> metadata but without a RelOptCall, he/she will need to create a new
> instance of RelMetadataQuery.
>
> On Thu, Oct 17, 2019 at 2:27 AM, Xiening Dai wrote:
>
>> I have seen both patterns in current code base. In most places, for
>> example SubQueryRemoveRule, AggregateUnionTrasposeRule
>> SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used. And
>> there are a few other places where new RelMetadataQuery instance is
>> created, which Haisheng attempts to fix.
>>
>> Currently RelOptCluster.invalidateMetadataQuery() is called at the end of
>> RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
>> guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
>>
>>
>> > On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
>> >
>> > This is the reason I was struggling for the discussion.
>> >
>> > Best,
>> > Danny Chan
>> > On Oct 16, 2019 at 11:23 AM +0800, dev@calcite.apache.org wrote:
>> >>
>> >> RelMetadataQuery
>>
>>



[jira] [Created] (CALCITE-3424) AssertionError thrown for user-defined table function with array argument

2019-10-17 Thread Igor Guzenko (Jira)
Igor Guzenko created CALCITE-3424:
-

 Summary: AssertionError thrown for user-defined table function 
with array argument
 Key: CALCITE-3424
 URL: https://issues.apache.org/jira/browse/CALCITE-3424
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.21.0
Reporter: Igor Guzenko
Assignee: Igor Guzenko


*Steps to reproduce:*

*1.* Add method with list parameter to Smalls.java
{code:java}
  public static final Method GENERATE_STRINGS_2_METHOD =
      Types.lookupMethod(Smalls.class, "generateStrings2", List.class);

  public static QueryableTable generateStrings2(final List list) {
    return generateStrings(list.size());
  }
{code}
*2.* Add test method which uses new user-defined table function to 
TableFunctionTest.java
{code:java}
  @Test public void testTableFunction2() throws SQLException {
    try (Connection connection = DriverManager.getConnection("jdbc:calcite:")) {
      CalciteConnection calciteConnection =
          connection.unwrap(CalciteConnection.class);
      SchemaPlus rootSchema = calciteConnection.getRootSchema();
      SchemaPlus schema = rootSchema.add("s", new AbstractSchema());
      final TableFunction table =
          TableFunctionImpl.create(Smalls.GENERATE_STRINGS_2_METHOD);
      schema.add("GenerateStrings2", table);
      final String sql = "select *\n"
          + "from table(\"s\".\"GenerateStrings2\"(5,4,3,1,2)) as t(n, c)\n"
          + "where char_length(c) > 3";
      ResultSet resultSet = connection.createStatement().executeQuery(sql);
      assertThat(CalciteAssert.toString(resultSet),
          equalTo("N=4; C=abcd\n"));
    }
  }
{code}
Running this test method produces the following stack trace:
{code:none}
java.lang.AssertionError: use createArrayType() instead

at org.apache.calcite.sql.type.SqlTypeFactoryImpl.assertBasic(SqlTypeFactoryImpl.java:221)
at org.apache.calcite.sql.type.SqlTypeFactoryImpl.createSqlType(SqlTypeFactoryImpl.java:48)
at org.apache.calcite.jdbc.JavaTypeFactoryImpl.toSql(JavaTypeFactoryImpl.java:255)
at org.apache.calcite.prepare.CalciteCatalogReader.toSql(CalciteCatalogReader.java:381)
at org.apache.calcite.prepare.CalciteCatalogReader.lambda$toSql$7(CalciteCatalogReader.java:370)
at com.google.common.collect.Lists$TransformingRandomAccessList$1.transform(Lists.java:640)
at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
at java.util.AbstractCollection.toArray(AbstractCollection.java:141)
at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:239)
at org.apache.calcite.sql.SqlFunction.<init>(SqlFunction.java:123)
at org.apache.calcite.sql.validate.SqlUserDefinedFunction.<init>(SqlUserDefinedFunction.java:63)
at org.apache.calcite.sql.validate.SqlUserDefinedTableFunction.<init>(SqlUserDefinedTableFunction.java:45)
at org.apache.calcite.prepare.CalciteCatalogReader.toOp(CalciteCatalogReader.java:338)
at org.apache.calcite.prepare.CalciteCatalogReader.toOp(CalciteCatalogReader.java:302)
at org.apache.calcite.prepare.CalciteCatalogReader.lambda$lookupOperatorOverloads$3(CalciteCatalogReader.java:271)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEachOrdered(ReferencePipeline.java:490)
at org.apache.calcite.prepare.CalciteCatalogReader.lookupOperatorOverloads(CalciteCatalogReader.java:272)
at org.apache.calcite.sql.util.ChainedSqlOperatorTable.lookupOperatorOverloads(ChainedSqlOperatorTable.java:73)
at org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1195)
at org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1180)
at org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1180)
at org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1180)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:937)
at 

Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread Seliverstov Igor
At least in our project (Apache Ignite) we use AbstractRelNode.metadata().

But that is so because there is no way to put our metadata type into
RelMetadataQuery without changes in Calcite.

Regards,
Igor

Thu, Oct 17, 2019, 19:16 Xiening Dai:

> MetadataFactory is still useful. It provides a way to access Metadata
> directly. If someone creates a new type of Metadata class, it can be
> accessed through AbstractRelNode.metadata(). This way you don’t need to
> update RelMetadataQuery interface to include the getter for this new meta.
> Although I don’t see this pattern being used often, but I do think it is
> still useful and shouldn’t be removed.
>
>
> For your second point, I think you would still need a way to keep
> RelMetadataQuery object during a rule call. If you choose to create new
> instance, you will have to pass it around while applying the rule. That
> actually complicates things a lot.
>
>
> > On Oct 17, 2019, at 12:49 AM, XING JIN  wrote:
> >
> > 1. RelMetadataQuery covers the functionality of MetadataFactory, why
> should
> > we keep/maintain both of them ? shall we just deprecate MetadataFactory.
> I
> > see MetadataFactory is rarely used in current code. Also I
> > think MetadataFactory is not good place to offering customized metadata,
> > which will make user confused for the difference between RelMetadataQuery
> > and MetadataFactory.
> >
> >> Customized RelMetadataQuery with code generated meta handler for
> > customized metadata, also can provide convenient way to get metadata.
> > It makes sense for me.
> >
> > 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall
> we
> > deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
> > metadata but without a RelOptCall, he/she will need to create a new
> > instance of RelMetadataQuery.
> >
> > On Thu, Oct 17, 2019 at 2:27 AM, Xiening Dai wrote:
> >
> >> I have seen both patterns in current code base. In most places, for
> >> example SubQueryRemoveRule, AggregateUnionTrasposeRule
> >> SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used.
> And
> >> there are a few other places where new RelMetadataQuery instance is
> >> created, which Haisheng attempts to fix.
> >>
> >> Currently RelOptCluster.invalidateMetadataQuery() is called at the end
> of
> >> RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
> >> guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
> >>
> >>
> >>> On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
> >>>
> >>> This is the reason I was struggling for the discussion.
> >>>
> >>> Best,
> >>> Danny Chan
> >>> On Oct 16, 2019 at 11:23 AM +0800, dev@calcite.apache.org wrote:
> >>>>
> >>>> RelMetadataQuery
> >>
> >>
>
>


Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread Xiening Dai
MetadataFactory is still useful. It provides a way to access Metadata directly. 
If someone creates a new type of Metadata class, it can be accessed through 
AbstractRelNode.metadata(). This way you don’t need to update RelMetadataQuery 
interface to include the getter for this new meta. Although I don’t see this 
pattern being used often, I do think it is still useful and shouldn’t be 
removed.


For your second point, I think you would still need a way to keep 
RelMetadataQuery object during a rule call. If you choose to create new 
instance, you will have to pass it around while applying the rule. That 
actually complicates things a lot. 


> On Oct 17, 2019, at 12:49 AM, XING JIN  wrote:
> 
> 1. RelMetadataQuery covers the functionality of MetadataFactory, why should
> we keep/maintain both of them ? shall we just deprecate MetadataFactory. I
> see MetadataFactory is rarely used in current code. Also I
> think MetadataFactory is not good place to offering customized metadata,
> which will make user confused for the difference between RelMetadataQuery
> and MetadataFactory.
> 
>> Customized RelMetadataQuery with code generated meta handler for
> customized metadata, also can provide convenient way to get metadata.
> It makes sense for me.
> 
> 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall we
> deprecate RelOptCluster#getMetadataQuery ? If a user wants to get the
> metadata but without a RelOptCall, he/she will need to create a new
> instance of RelMetadataQuery.
> 
> On Thu, Oct 17, 2019 at 2:27 AM, Xiening Dai wrote:
> 
>> I have seen both patterns in current code base. In most places, for
>> example SubQueryRemoveRule, AggregateUnionTrasposeRule
>> SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used. And
>> there are a few other places where new RelMetadataQuery instance is
>> created, which Haisheng attempts to fix.
>> 
>> Currently RelOptCluster.invalidateMetadataQuery() is called at the end of
>> RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
>> guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
>> 
>> 
>>> On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
>>> 
>>> This is the reason I was struggling for the discussion.
>>> 
>>> Best,
>>> Danny Chan
>>> On Oct 16, 2019 at 11:23 AM +0800, dev@calcite.apache.org wrote:
>>>>
>>>> RelMetadataQuery
>> 
>> 



Re: [DISCUSS] Make Avatica more discoverable

2019-10-17 Thread Michael Mior
Since there's only one sub-project, why don't we convert the
Sub-Projects section on the homepage into a short description of
Avatica?
--
Michael Mior
mm...@apache.org

On Thu, Oct 17, 2019 at 06:19, Francis Chuang wrote:
>
> This was one of the comments on the October Board report for Calcite:
>
>df: Great progress!
>
>About Avatica - I had to google to find the subproject as I did
>not see anything obvious on the Calcite site. I would also be
>good to provide more status on the subproject in your future
>reports.
>
> The Avatica sub-project is currently linked on Calcite's homepage, but
> it is not very noticeable or discoverable.
>
> Any thoughts on how we can improve the visibility of Avatica?
>
> Francis


Re: CassandraAdapter (Add Type) and WHERE statement.

2019-10-17 Thread Michael Mior
Perhaps I'm missing something, but I don't see why this would be any
more efficient. Selecting all data is also not an efficient operation
in Cassandra. Using ALLOW FILTERING will likely be more efficient
since it's basically the same as doing a table scan, but it avoids
returning data which would later be filtered by Calcite anyway.
--
Michael Mior
mm...@apache.org

On Thu, Oct 17, 2019 at 09:13, Yanna elina wrote:
>
> Thank for reply Michael.
>
> yes i understood  this on the documentation for example with "WHERE"
> statement   calcite i  force the . "ALLOW FILTERING; "
> and this can be expensive.
>
>  I think there may be an interesting approach using STREAM.
>
> for example maintain a regular update between a cassandra TABLE and a
> STREAM TABLE.
>
> CASSANDRA_TABLE_A .(SELECT * FROM TABLE_A) > STREAM_TABLE_A .
> SELECT STREAM * FROM STREAM_TABLE_A WHERE username = 'JmuhsAaMdw'
>
> i guess it will be more efficient to directly make the WHERE from the
> STREAM than the cassandra_adapter  using "allow filtering"
> a synchronization strategy can be set up between the cassandra table and
> the STREAM table
> what is your opinion about this approach ?
> Thanks !
> Yana
>
>
> On Wed, Oct 16, 2019 at 17:08, Michael Mior wrote:
>
> > You're right that there are several types which are not supported by
> > the Cassandra adapter. We would happily accept pull requests to add
> > support for new types.
> >
> > You're also correct that Cassandra cannot efficiently execute queries
> > which do not specify the partition key. Calcite will not make those
> > queries more efficient, but it can make it easier to execute queries
> > that CQL does not directly support. Ultimately data is still stored
> > based on the partition key, so if your query does not specify a
> > partition key, Calcite will still need to issue an expensive
> > cross-partition query to Cassandra.
> > --
> > Michael Mior
> > mm...@apache.org
> >
> > On Wed, Oct 16, 2019 at 07:57, Yanna elina wrote:
> > >
> > > Hi guys,
> > >
> > > I am studying Calcite and the benefits that a Cassandra-Calcite adapter
> > > can bring, for example the possibility of joins.
> > >
> > > The problem is that the set of types defined in
> > > CassandraSchema.getRelDataType(..) is very limited;
> > > some important types are missing (boolean, array, etc.).
> > >
> > > I thought of inheriting from the CassandraSchema class to override this
> > > method and add more types, but this method is used inside CassandraTable
> > > too.
> > >
> > > I would like to avoid fully rewriting this adapter :)
> > >
> > > Do you have suggestions?
> > >
> > > My second question is: Cassandra is not optimized for a WHERE on a key
> > > that is not part of the cluster/partition key. I was wondering if Calcite
> > > could play a role in improving performance without this mechanism.
> > >
> > >
> > > Thanks!
> > >
> > > Yanna
> >


[jira] [Created] (CALCITE-3423) Support using CAST operation and bool type value in table macro

2019-10-17 Thread Wang Yanlin (Jira)
Wang Yanlin created CALCITE-3423:


 Summary: Support using CAST operation and bool type value in table macro
 Key: CALCITE-3423
 URL: https://issues.apache.org/jira/browse/CALCITE-3423
 Project: Calcite
  Issue Type: New Feature
Reporter: Wang Yanlin


Currently, using a bool type value or a CAST operation in a table macro throws an exception.

Add the following code snippet to *JdbcTest* to reproduce.
 
{code:java}
// Check a CAST argument.
resultSet = connection.createStatement().executeQuery(
    "select * from table(\"s\".\"Str\"(MAP['a', 1, 'baz', 2], "
        + "cast(1 as bigint))) as t(n)");
assertThat(CalciteAssert.toString(resultSet),
    equalTo("N={'a'=1, 'baz'=2}\n"
        + "N=1   \n"));
// Check a bool type argument.
resultSet = connection.createStatement().executeQuery(
    "select * from table(\"s\".\"Str\"(MAP['a', 1, 'baz', 2], true)) "
        + "as t(n)");
assertThat(CalciteAssert.toString(resultSet),
    equalTo("N={'a'=1, 'baz'=2}\n"
        + "N=true\n"));
// Check a nested CAST argument.
resultSet = connection.createStatement().executeQuery(
    "select * from table(\"s\".\"Str\"(MAP['a', 1, 'baz', 2],"
        + "cast(cast(1 as int) as varchar(1)))) as t(n)");
assertThat(CalciteAssert.toString(resultSet),
    equalTo("N={'a'=1, 'baz'=2}\n"
        + "N=1   \n"));
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: CassandraAdapter (Add Type) and WHERE statement.

2019-10-17 Thread Yanna elina
Thanks for the reply, Michael.

Yes, I understood this from the documentation: for example, with a "WHERE"
statement Calcite forces "ALLOW FILTERING", and this can be expensive.

I think there may be an interesting approach using STREAM.

For example, maintain a regular update between a Cassandra TABLE and a
STREAM TABLE:

CASSANDRA_TABLE_A .(SELECT * FROM TABLE_A) > STREAM_TABLE_A .
SELECT STREAM * FROM STREAM_TABLE_A WHERE username = 'JmuhsAaMdw'

I guess it will be more efficient to apply the WHERE directly to the
STREAM than to go through the Cassandra adapter using "ALLOW FILTERING".
A synchronization strategy can be set up between the Cassandra table and
the STREAM table.
What is your opinion about this approach?
Thanks!
Yana


On Wed, Oct 16, 2019 at 17:08, Michael Mior wrote:

> You're right that there are several types which are not supported by
> the Cassandra adapter. We would happily accept pull requests to add
> support for new types.
>
> You're also correct that Cassandra cannot efficiently execute queries
> which do not specify the partition key. Calcite will not make those
> queries more efficient, but it can make it easier to execute queries
> that CQL does not directly support. Ultimately data is still stored
> based on the partition key, so if your query does not specify a
> partition key, Calcite will still need to issue an expensive
> cross-partition query to Cassandra.
> --
> Michael Mior
> mm...@apache.org
>
> On Wed, Oct 16, 2019 at 07:57, Yanna elina wrote:
> >
> > Hi guys,
> >
> > I am studying Calcite and the benefits that a Cassandra-Calcite adapter
> > can bring, for example the possibility of joins.
> >
> > The problem is that the set of types defined in
> > CassandraSchema.getRelDataType(..) is very limited;
> > some important types are missing (boolean, array, etc.).
> >
> > I thought of inheriting from the CassandraSchema class to override this
> > method and add more types, but this method is used inside CassandraTable
> > too.
> >
> > I would like to avoid fully rewriting this adapter :)
> >
> > Do you have suggestions?
> >
> > My second question is: Cassandra is not optimized for a WHERE on a key
> > that is not part of the cluster/partition key. I was wondering if Calcite
> > could play a role in improving performance without this mechanism.
> >
> >
> > Thanks!
> >
> > Yanna
>


[DISCUSS] Make Avatica more discoverable

2019-10-17 Thread Francis Chuang

This was one of the comments on the October Board report for Calcite:

  df: Great progress!

  About Avatica - I had to google to find the subproject as I did
  not see anything obvious on the Calcite site. It would also be
  good to provide more status on the subproject in your future
  reports.

The Avatica sub-project is currently linked on Calcite's homepage, but 
it is not very noticeable or discoverable.


Any thoughts on how we can improve the visibility of Avatica?

Francis


Re: [DISCUSS] Make Enumerable operators responsive to interrupts

2019-10-17 Thread Vladimir Sitnikov
Roman Elizarov raises valid points re 'interrupts are too hard (or even
impossible) to get right':
https://twitter.com/relizarov/status/1184460504238100480

Vladimir


Re: [DISCUSS] Make Enumerable operators responsive to interrupts

2019-10-17 Thread Stamatis Zampetakis
I agree with both points.

There are projects which do not handle interrupts in the best possible way.
My most recent experience was with H2 [1] where the database breaks
completely if a single thread is interrupted.

Best,
Stamatis

[1] https://github.com/h2database/h2database/issues/227

On Wed, Oct 16, 2019 at 10:10 AM Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> Stamatis,
> "cooperative to interrupt" sounds like a nice idea; however, I have been bitten
> multiple times by improper interrupt handling (not really with Calcite, but
> with other projects).
>
> In other words, it is good when everybody supports that.
> However, the other libraries might receive an unexpected
> InterruptedException, and they might go off the rails.
>
> For example, suppose you are implementing logger.info(...). What do you do
> if you get an exception while logging?
> Do you attempt to log it again? Do you attempt to log to System.err?
> That is puzzling, and I have seen a case when the software performed 3
> attempts, then it stopped logging completely
> because it thought "the logfile is broken".
>
> So:
> 1) It might be worth adding "interrupted" checks in the executors
> 2) If Calcite ever uses .interrupt(), then it should be configurable (e.g.
> to avoid cases when .interrupt() kills code that is not prepared for it)
>
> Vladimir
>
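
As a rough illustration of the two points above, here is a minimal Java
sketch of a row iterator that polls for cancellation between rows. It
assumes an explicit cancel flag plus an optional, configurable check of
the thread's interrupt status. CancellableIterator and its parameters are
hypothetical names used only for this sketch; it is not taken from
Calcite's Enumerable implementation.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical wrapper (not a Calcite class) that polls for cancellation between rows. */
final class CancellableIterator<T> implements Iterator<T> {
  private final Iterator<T> delegate;
  private final AtomicBoolean cancelFlag;
  private final boolean checkThreadInterrupt;

  CancellableIterator(Iterator<T> delegate, AtomicBoolean cancelFlag,
      boolean checkThreadInterrupt) {
    this.delegate = delegate;
    this.cancelFlag = cancelFlag;
    this.checkThreadInterrupt = checkThreadInterrupt;
  }

  @Override public boolean hasNext() {
    checkCancelled();
    return delegate.hasNext();
  }

  @Override public T next() {
    checkCancelled();
    return delegate.next();
  }

  private void checkCancelled() {
    if (cancelFlag.get()) {
      throw new CancellationException("cancelled via flag");
    }
    // isInterrupted() only reads the interrupt status and does not clear it,
    // so callers further up the stack can still observe the interrupt.
    if (checkThreadInterrupt && Thread.currentThread().isInterrupted()) {
      throw new CancellationException("thread interrupted");
    }
  }

  public static void main(String[] args) {
    AtomicBoolean cancel = new AtomicBoolean(false);
    Iterator<Integer> rows =
        new CancellableIterator<>(List.of(1, 2, 3).iterator(), cancel, false);
    System.out.println(rows.next());
    cancel.set(true); // e.g. set by another thread that cancels the statement
    try {
      rows.next();
    } catch (CancellationException e) {
      System.out.println("stopped: " + e.getMessage());
    }
  }
}
```

With checkThreadInterrupt set to false the iterator never reacts to
Thread.interrupt(), which is one way to keep the interrupt-based path
optional for code that is not prepared for it.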


Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread XING JIN
BTW, I think the JIRA issue discussed in this thread should be
https://issues.apache.org/jira/browse/CALCITE-2855, not CALCITE-2885.

Best,
Jin

On Thu, Oct 17, 2019 at 3:49 PM, XING JIN wrote:

> 1. RelMetadataQuery covers the functionality of MetadataFactory, so why
> should we keep and maintain both of them? Shall we just deprecate
> MetadataFactory? I see that MetadataFactory is rarely used in the current
> code. Also, I don't think MetadataFactory is a good place to offer
> customized metadata, since it will confuse users about the difference
> between RelMetadataQuery and MetadataFactory.
>
> > Customized RelMetadataQuery with code generated meta handler for
> > customized metadata, also can provide convenient way to get metadata.
> It makes sense to me.
>
> 2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall we
> deprecate RelOptCluster#getMetadataQuery? If a user wants to get
> metadata without a RelOptCall, he/she will need to create a new
> instance of RelMetadataQuery.
>
> On Thu, Oct 17, 2019 at 2:27 AM, Xiening Dai wrote:
>
>> I have seen both patterns in the current code base. In most places, for
>> example SubQueryRemoveRule, AggregateUnionTransposeRule,
>> SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used. And
>> there are a few other places where a new RelMetadataQuery instance is
>> created, which Haisheng attempts to fix.
>>
>> Currently RelOptCluster.invalidateMetadataQuery() is called at the end of
>> RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
>> guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
>>
>>
>> > On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
>> >
>> > This is the reason I was struggling for the discussion.
>> >
>> > Best,
>> > Danny Chan
>> > On Oct 16, 2019, at 11:23 AM +0800, dev@calcite.apache.org wrote:
>> >>
>> >> RelMetadataQuery
>>
>>


Re: [DISCUSSION] Extension of Metadata Query

2019-10-17 Thread XING JIN
1. RelMetadataQuery covers the functionality of MetadataFactory, so why should
we keep and maintain both of them? Shall we just deprecate MetadataFactory? I
see that MetadataFactory is rarely used in the current code. Also, I don't
think MetadataFactory is a good place to offer customized metadata, since it
will confuse users about the difference between RelMetadataQuery and
MetadataFactory.

> Customized RelMetadataQuery with code generated meta handler for
> customized metadata, also can provide convenient way to get metadata.
It makes sense to me.

2. If the natural lifespan of a RelMetadataQuery is a RelOptCall, shall we
deprecate RelOptCluster#getMetadataQuery? If a user wants to get
metadata without a RelOptCall, he/she will need to create a new
instance of RelMetadataQuery.

On Thu, Oct 17, 2019 at 2:27 AM, Xiening Dai wrote:

> I have seen both patterns in the current code base. In most places, for
> example SubQueryRemoveRule, AggregateUnionTransposeRule,
> SortJoinTransposeRule, etc., RelOptCluster.getMetadataQuery() is used. And
> there are a few other places where a new RelMetadataQuery instance is
> created, which Haisheng attempts to fix.
>
> Currently RelOptCluster.invalidateMetadataQuery() is called at the end of
> RelOptRuleCall.transformTo(). So the lifespan of RelMetadataQuery is
> guaranteed to be within a RelOptCall. I think Haisheng’s fix is safe.
>
>
> > On Oct 16, 2019, at 1:53 AM, Danny Chan  wrote:
> >
> > This is the reason I was struggling for the discussion.
> >
> > Best,
> > Danny Chan
> > On Oct 16, 2019, at 11:23 AM +0800, dev@calcite.apache.org wrote:
> >>
> >> RelMetadataQuery
>
>
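
The lifespan argument in this thread can be sketched with a toy Java
model. It assumes, as described above, that the cluster hands out one
cached metadata query and drops it at the end of transformTo(). The Toy*
classes are simplified stand-ins written for this sketch, not Calcite's
RelOptCluster, RelOptRuleCall or RelMetadataQuery, and the row-count
formula is a placeholder.

```java
import java.util.HashMap;
import java.util.Map;

/** Small demo; all Toy* classes are simplified stand-ins, not Calcite classes. */
final class MetadataLifespanSketch {
  public static void main(String[] args) {
    ToyCluster cluster = new ToyCluster();
    ToyRuleCall call = new ToyRuleCall(cluster);
    ToyMetadataQuery mq = call.getMetadataQuery();
    mq.getRowCount("Scan(emp)");
    mq.getRowCount("Scan(emp)"); // second call is served from the cache
    System.out.println("computations=" + mq.computations);
    call.transformTo("Filter(Scan(emp))");
    System.out.println("same query after transformTo: "
        + (call.getMetadataQuery() == mq));
  }
}

/** Toy metadata query: memoizes answers for as long as the instance lives. */
final class ToyMetadataQuery {
  private final Map<String, Double> rowCountCache = new HashMap<>();
  int computations;

  double getRowCount(String relDigest) {
    return rowCountCache.computeIfAbsent(relDigest, d -> {
      computations++;
      return (double) d.length(); // placeholder estimate, not a real cost model
    });
  }
}

/** Toy cluster: hands out one shared query until it is invalidated. */
final class ToyCluster {
  private ToyMetadataQuery mq;

  ToyMetadataQuery getMetadataQuery() {
    if (mq == null) {
      mq = new ToyMetadataQuery();
    }
    return mq;
  }

  void invalidateMetadataQuery() {
    mq = null;
  }
}

/** Toy rule call: reuses the cluster's query and invalidates it after a transform. */
final class ToyRuleCall {
  private final ToyCluster cluster;

  ToyRuleCall(ToyCluster cluster) {
    this.cluster = cluster;
  }

  ToyMetadataQuery getMetadataQuery() {
    return cluster.getMetadataQuery();
  }

  void transformTo(String newRelDigest) {
    // Registering newRelDigest with a planner is omitted; cached metadata may be
    // stale once the plan changes, so the shared query is dropped here.
    cluster.invalidateMetadataQuery();
  }
}
```

In this model a caller without a rule call would either ask the cluster
directly or construct its own ToyMetadataQuery, which mirrors question 2
above.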