[jira] [Created] (CALCITE-1886) Support LIMIT [offset,] row_count
Chen Xin Yu created CALCITE-1886:

Summary: Support LIMIT [offset,] row_count
Key: CALCITE-1886
URL: https://issues.apache.org/jira/browse/CALCITE-1886
Project: Calcite
Issue Type: Improvement
Reporter: Chen Xin Yu
Assignee: Julian Hyde

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
How to unload classes generated by Calcite's code generator?
Under high concurrency, I find that Calcite's code generator creates many classes in PermGen that cannot be garbage-collected. After some time, this causes the JVM to run many full GCs. Can I unload those classes?
[jira] [Created] (CALCITE-1885) calcite code gen make many class in perm and it cant be gc
yiming.xu created CALCITE-1885:

Summary: calcite code gen make many class in perm and it cant be gc
Key: CALCITE-1885
URL: https://issues.apache.org/jira/browse/CALCITE-1885
Project: Calcite
Issue Type: Bug
Reporter: yiming.xu
Assignee: Julian Hyde

Calcite's code generation creates many classes in PermGen that cannot be garbage-collected. How can those classes be unloaded?
[jira] [Created] (CALCITE-1884) dateStringToUnixDate() / unixDateToString() returns wrong results
Haohui Mai created CALCITE-1884:

Summary: dateStringToUnixDate() / unixDateToString() returns wrong results
Key: CALCITE-1884
URL: https://issues.apache.org/jira/browse/CALCITE-1884
Project: Calcite
Issue Type: Bug
Components: avatica
Affects Versions: 1.13.0
Reporter: Haohui Mai

dateStringToUnixDate() and unixDateToString() do not return consistent results. The following test fails:

{noformat}
@Test public void testUnixDate() {
  int days = DateTimeUtils.dateStringToUnixDate("1500-04-30");
  assertEquals("1500-04-30", DateTimeUtils.unixDateToString(days));
}
{noformat}
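For comparison, the round-trip property that the failing test expects can be sketched with nothing but java.time, which works in the proleptic ISO calendar. The class and method names below (UnixDateRoundTrip, toUnixDate, fromUnixDate) are hypothetical and are not part of Calcite or Avatica. The report does not state what causes the mismatch, so this only illustrates the expected behaviour and is not a proposed fix.

```java
import java.time.LocalDate;

/** Hypothetical reference helper; this is NOT Calcite's DateTimeUtils. */
final class UnixDateRoundTrip {
  private UnixDateRoundTrip() {}

  /** Days since 1970-01-01 for an ISO date string such as "1500-04-30". */
  static int toUnixDate(String date) {
    return Math.toIntExact(LocalDate.parse(date).toEpochDay());
  }

  /** ISO date string for a day count since 1970-01-01. */
  static String fromUnixDate(int days) {
    return LocalDate.ofEpochDay(days).toString();
  }

  public static void main(String[] args) {
    // Convert to a day count and back; the two strings should match.
    int days = toUnixDate("1500-04-30");
    System.out.println(days + " -> " + fromUnixDate(days));
  }
}
```

Comparing the day count from such a reference with the value returned by DateTimeUtils for the same string might help narrow down which of the two conversions is off.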
Re: Is Avatica's ResultSetResponse's Signature field always present?
Thanks for clarifying! I've opened https://github.com/apache/calcite-avatica/pull/10 to make this clearer.

Cheers,
Francis

On 12/07/2017 1:28 AM, Josh Elser wrote:
> There's one point I want to bring up first about "optional" fields. Every attribute on Avatica's messages is (or should be) listed as optional. This is how we correctly handle "drift" in the protocol itself. If we had fields marked as required, we would never be able to change them, which may cause problems.
>
> It would probably be good to work towards tying docs to a specific version so we can remove this ambiguity :)
>
> To answer your question: no, there will be no Signature for INSERT/UPSERT operations (any operation which returns a number of rows affected instead of a ResultSet). For SQL which generates a ResultSet (some rows of data), the Signature would "always" be provided.
>
> On 7/11/17 4:38 AM, F21 wrote:
>> I have a bug report for the Go Avatica driver where someone executed an `UPSERT` statement and caused the driver to crash. See https://github.com/Boostport/avatica/issues/34
>>
>> The driver crashed because we tried to read `ResultSetResponse.Signature` and it was null, as the statement was an upsert statement.
>>
>> According to the protobuf documentation [0], signature is non-optional and should always be present. Does this guarantee extend to data modification statements like UPSERT?
>>
>> Cheers,
>> Francis
>>
>> [0] https://calcite.apache.org/avatica/docs/protobuf_reference.html#resultsetresponse
Re: [DISCUSS] Draft board report
Looks good. I would restore the line

  - Last PMC addition was Michael Mior on Mon Apr 03 2017

because the Board likes to monitor PMC & committer development.

I also get a sense that we are attracting a more diverse set of contributors than usual. The last 100 commits had 29 distinct contributors[1]. Typically, each batch of 100 commits has around 20 distinct contributors.

Thanks for writing the report!

Julian

[1] git log origin/master | grep Author | awk 'FNR <= 100 {print}' | sort -u | wc -l

On Tue, Jul 11, 2017 at 2:49 PM, Jesus Camacho Rodriguez wrote:
> Calcite community,
>
> I attach the draft of the report I propose to file for the 7/19 Apache
> board meeting.
>
> Please, let me know if you have any feedback.
>
> Thanks,
> Jesús
>
> ---
>
> Attachment O: Report from the Apache Calcite Project [Jesús Camacho
> Rodríguez]
>
> ## Description:
> Apache Calcite is a highly customizable framework for parsing and
> planning queries on data in a wide variety of formats. It allows
> database-like access, and in particular a SQL interface and advanced
> query optimization, for data not residing in a traditional database.
>
> Avatica is a sub-module within Calcite, and provides a framework
> for building local and remote JDBC and ODBC database drivers. Avatica
> has an independent release schedule, and since April 2017, it has its
> own independent repository.
>
> ## Issues:
> - There are no issues requiring board attention at this time.
>
> ## Activity:
>
> Development and mailing list activity is steady for both Calcite and
> its Avatica sub-project.
>
> Since the last board meeting, there has been one Calcite release
> and one Avatica release.
>
> Avatica 1.10.0 was released at the end of May. As the Calcite and
> Avatica projects become more separate, this was the first release
> since Avatica’s git repository separated from Calcite’s repository
> during the previous quarter. The release added support for JDBC Array
> data, Docker, and JDK 9 (it continues to run on JDK 7 and 8).
> In total, there were over 20 new features and bug fixes.
>
> In turn, Calcite 1.13.0 was released at the end of June. The release
> included more than 75 resolved issues, comprising a large number of
> new features as well as general improvements and bug-fixes. Among
> others, Calcite was upgraded to use the recently released version of
> Avatica.
>
> Our community continued growing this quarter: three new committers
> (Slim Bouguerra, Kevin Liew, and Zhiqiang He) were added to the project.
>
> Finally, there was an important presence of the Apache Calcite project
> in talks at multiple events, such as Apache: Big Data North America 2017
> (Miami, FL), PhoenixCon (San Francisco, CA) and
> DataWorks Summit USA 2017 (San Jose, CA).
>
> ## Health report:
>
> Activity levels on mailing lists, git and JIRA are normal for both
> Calcite and Avatica.
>
> ## PMC changes:
>
> - Currently 16 PMC members.
> - No new PMC members added in the last 3 months
>
> ## Committer base changes:
>
> - Currently 25 committers.
> - New committers:
>   - Slim Bouguerra was added as a committer on Sun Jun 18 2017
>   - Kevin Liew was added as a committer on Sun Jun 18 2017
>   - Zhiqiang He was added as a committer on Fri Jun 09 2017
>
> ## Releases:
>
> - 1.13.0 was released on Mon Jun 26 2017
> - avatica-1.10.0 was released on Tue May 30 2017
>
> ## JIRA activity:
>
> - 135 JIRA tickets created in the last 3 months
> - 112 JIRA tickets closed/resolved in the last 3 months
[DISCUSS] Draft board report
Calcite community,

I attach the draft of the report I propose to file for the 7/19 Apache
board meeting.

Please, let me know if you have any feedback.

Thanks,
Jesús

---

Attachment O: Report from the Apache Calcite Project [Jesús Camacho
Rodríguez]

## Description:
Apache Calcite is a highly customizable framework for parsing and
planning queries on data in a wide variety of formats. It allows
database-like access, and in particular a SQL interface and advanced
query optimization, for data not residing in a traditional database.

Avatica is a sub-module within Calcite, and provides a framework
for building local and remote JDBC and ODBC database drivers. Avatica
has an independent release schedule, and since April 2017, it has its
own independent repository.

## Issues:
- There are no issues requiring board attention at this time.

## Activity:

Development and mailing list activity is steady for both Calcite and
its Avatica sub-project.

Since the last board meeting, there has been one Calcite release
and one Avatica release.

Avatica 1.10.0 was released at the end of May. As the Calcite and
Avatica projects become more separate, this was the first release
since Avatica’s git repository separated from Calcite’s repository
during the previous quarter. The release added support for JDBC Array
data, Docker, and JDK 9 (it continues to run on JDK 7 and 8).
In total, there were over 20 new features and bug fixes.

In turn, Calcite 1.13.0 was released at the end of June. The release
included more than 75 resolved issues, comprising a large number of
new features as well as general improvements and bug-fixes. Among
others, Calcite was upgraded to use the recently released version of
Avatica.

Our community continued growing this quarter: three new committers
(Slim Bouguerra, Kevin Liew, and Zhiqiang He) were added to the project.

Finally, there was an important presence of the Apache Calcite project
in talks at multiple events, such as Apache: Big Data North America 2017
(Miami, FL), PhoenixCon (San Francisco, CA) and
DataWorks Summit USA 2017 (San Jose, CA).

## Health report:

Activity levels on mailing lists, git and JIRA are normal for both
Calcite and Avatica.

## PMC changes:

- Currently 16 PMC members.
- No new PMC members added in the last 3 months

## Committer base changes:

- Currently 25 committers.
- New committers:
  - Slim Bouguerra was added as a committer on Sun Jun 18 2017
  - Kevin Liew was added as a committer on Sun Jun 18 2017
  - Zhiqiang He was added as a committer on Fri Jun 09 2017

## Releases:

- 1.13.0 was released on Mon Jun 26 2017
- avatica-1.10.0 was released on Tue May 30 2017

## JIRA activity:

- 135 JIRA tickets created in the last 3 months
- 112 JIRA tickets closed/resolved in the last 3 months
Re: Explain Plan for aggregating a single column in CSV Adapter
Hi,

If I change CsvTranslatableTable so that it implements ProjectableFilterableTable instead of TranslatableTable, and implement the scan method, Calcite's own rules apply and the plan comes out right, scanning only the field used in the aggregate function.

However, I have now realized that "select count(*) from EMPS" generates the plan:

EnumerableAggregate(group=[{}], EXPR$0=[COUNT()])
  CsvTableScan(table=[[SALES, EMPS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

"select * from EMPS" generates the plan:

CsvTableScan(table=[[SALES, EMPS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

Notice that count(*) generates a plan that scans all fields, which means converting all of them needlessly. Even with ProjectableFilterableTable, the count(*) plan scans all fields, whereas the plan for "select count(name) from EMPS" scans just one field.

What would be the best approach to handle count(*) without having to scan all fields?

Best regards,
Luis Fernando

On Thursday, July 6, 2017 18:05, Julian Hyde wrote:

Calcite should realize that Aggregate has an implied Project (because it only uses a few columns) and push that projection into the CsvTableScan, but it doesn’t. I think we need a new rule for Aggregate on a TableScan of a ProjectableFilterableTable. Can you create a JIRA case please?

I created a test case. It currently fails:

diff --git a/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java b/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java
index 00c59ee..2402872 100644
--- a/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java
+++ b/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java
@@ -241,6 +241,13 @@ public Void apply(ResultSet resultSet) {
         .ok();
   }
 
+  @Test public void testAggregateImpliesProject() throws SQLException {
+    final String sql = "select max(name) from EMPS";
+    final String plan = "PLAN=EnumerableAggregate(group=[{}], EXPR$0=[MAX($0)])\n"
+        + "  CsvTableScan(table=[[SALES, EMPS]], fields=[[1]])\n";
+    sql("smart", "explain plan for " + sql).returns(plan).ok();
+  }
+
   @Test public void testFilterableSelect() throws SQLException {
     sql("filterable-model", "select name from EMPS").ok();
   }

Julian

> On Jul 6, 2017, at 1:23 PM, Luis Fernando Kauer wrote:
>
> Hi,
> I'm trying to understand the CSV Adapter and how the rules are fired.
> The CsvProjectTableScanRule gets fired when I use CsvTranslatableTable.
> But I don't understand why I get a plan that scans all fields when I use
> an aggregate function. For example:
>
> explain plan for select name from emps;
> CsvTableScan(table=[[SALES, EMPS]], fields=[[1]])
>
> explain plan for select max(name) from emps;
> EnumerableAggregate(group=[{}], EXPR$0=[MAX($1)])
>   CsvTableScan(table=[[SALES, EMPS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
>
> I noticed that the rule gets fired, and at that point it shows just 1 field
> being used. But the last time CsvTableScan.deriveRowType() gets called it has
> all the fields set, and it's not the instance created by the rule, but the
> first instance, created with all the fields.
> Can anybody explain to me whether this is a bug or whether this is supposed
> to happen with aggregate functions?
> Best regards,
> Luis Fernando Kauer
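One way to think about the count(*) question is in terms of what a projecting scan has to do per row. The sketch below is a stand-alone illustration and does not use Calcite: ProjectedCsvScan, scan, and convert are hypothetical names, and the comma split is deliberately naive (no quoting). It mirrors the contract of ProjectableFilterableTable.scan, where a null projects array means "all fields", and adds the assumption that an empty array could stand for "no fields". Whether the planner actually hands an empty projection to the scan for count(*) is not established in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a projecting CSV scan; not Calcite's API. */
final class ProjectedCsvScan {
  private ProjectedCsvScan() {}

  /** Counts field conversions so the saving is observable. */
  static int conversions = 0;

  /** Placeholder for a per-field type conversion (assumed to be the costly step). */
  static Object convert(String raw) {
    conversions++;
    return raw.trim();
  }

  /**
   * Returns one Object[] per input line holding only the projected fields.
   * A null projects array means "all fields"; an empty array means
   * "no fields", which is all a row count needs.
   */
  static List<Object[]> scan(List<String> lines, int[] projects) {
    List<Object[]> rows = new ArrayList<>();
    for (String line : lines) {
      if (projects != null && projects.length == 0) {
        rows.add(new Object[0]); // the row is counted, nothing is converted
        continue;
      }
      String[] raw = line.split(",", -1); // naive split: no quoted fields
      int width = projects != null ? projects.length : raw.length;
      Object[] row = new Object[width];
      for (int i = 0; i < width; i++) {
        int field = projects != null ? projects[i] : i;
        row[i] = convert(raw[field]);
      }
      rows.add(row);
    }
    return rows;
  }

  public static void main(String[] args) {
    List<String> lines = new ArrayList<>();
    lines.add("100,Fred,10");
    lines.add("110,Eric,20");
    System.out.println("rows: " + scan(lines, new int[0]).size()
        + ", conversions: " + conversions);
  }
}
```

With an empty projection the scan still yields one (empty) row per line, so an aggregate above it can count rows, while the conversion counter stays at zero.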
Re: Is Avatica's ResultSetResponse's Signature field always present?
There's one point I want to bring up first about "optional" fields. Every attribute on Avatica's messages is (or should be) listed as optional. This is how we correctly handle "drift" in the protocol itself. If we had fields marked as required, we would never be able to change them, which may cause problems.

It would probably be good to work towards tying docs to a specific version so we can remove this ambiguity :)

To answer your question: no, there will be no Signature for INSERT/UPSERT operations (any operation which returns a number of rows affected instead of a ResultSet). For SQL which generates a ResultSet (some rows of data), the Signature would "always" be provided.

On 7/11/17 4:38 AM, F21 wrote:
> I have a bug report for the Go Avatica driver where someone executed an `UPSERT` statement and caused the driver to crash. See https://github.com/Boostport/avatica/issues/34
>
> The driver crashed because we tried to read `ResultSetResponse.Signature` and it was null, as the statement was an upsert statement.
>
> According to the protobuf documentation [0], signature is non-optional and should always be present. Does this guarantee extend to data modification statements like UPSERT?
>
> Cheers,
> Francis
>
> [0] https://calcite.apache.org/avatica/docs/protobuf_reference.html#resultsetresponse
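On the client side, the practical consequence of the answer above is that a driver should branch on whether a signature is present before reading it. The following is only a sketch of that pattern: Signature, ResultSetResponse, and describe are hypothetical stand-ins written for this example, not Avatica's generated protobuf classes, and the field names are assumptions.

```java
import java.util.Optional;

/** Hypothetical stand-ins for the wire messages; not Avatica's generated classes. */
final class SignatureHandling {
  private SignatureHandling() {}

  /** Stand-in for a statement signature (column metadata). */
  static final class Signature {
    final int columnCount;

    Signature(int columnCount) {
      this.columnCount = columnCount;
    }
  }

  /** Stand-in for a response; signature is null for INSERT/UPSERT-style statements. */
  static final class ResultSetResponse {
    final Signature signature;
    final long updateCount;

    ResultSetResponse(Signature signature, long updateCount) {
      this.signature = signature;
      this.updateCount = updateCount;
    }
  }

  /** Treats a missing signature as "this was an update", instead of dereferencing it. */
  static String describe(ResultSetResponse response) {
    return Optional.ofNullable(response.signature)
        .map(s -> "result set with " + s.columnCount + " column(s)")
        .orElse("update count " + response.updateCount);
  }

  public static void main(String[] args) {
    System.out.println(describe(new ResultSetResponse(null, 3)));
    System.out.println(describe(new ResultSetResponse(new Signature(2), -1)));
  }
}
```

The same shape applies in any client language: check for presence first, and fall back to the update count when the signature is absent.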
Is Avatica's ResultSetResponse's Signature field always present?
I have a bug report for the Go Avatica driver where someone executed an `UPSERT` statement and caused the driver to crash. See https://github.com/Boostport/avatica/issues/34 The driver crashed, because we tried to read `ResultSetResponse.Signature` and it was null as the statement was an upsert statement. According to the protobuf documentation [0], signature is non-optional and should always be present. Does this guarantee extend to data modification statements like UPSERT? Cheers, Francis [0] https://calcite.apache.org/avatica/docs/protobuf_reference.html#resultsetresponse
Re: Kerberos Authentication and Avatica
Thanks for the pointers, Josh :) I'll post back to the list when a release has been tagged.

On 11/07/2017 11:38 AM, Josh Elser wrote:
> On Jul 10, 2017 20:18, "F21" wrote:
>> Hey Josh,
>>
>> Thanks for clearing things up.
>>
>> In Go, it is not idiomatic for a database driver to reach out to environment variables. I think I will add an additional parameter called `krb5Conf` for users to point the driver to the location of `krb5.conf`. In the event that it is not provided, I plan to search the common locations listed here: https://www.ibm.com/support/knowledgecenter/en/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/rsec_SPNEGO_config_krb5.html and https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/tutorials/KerberosReq.html
>
> Sounds reasonable to me!
>
>> Regarding the use-case where the user performs authentication and passes the ticket to Avatica, what does the driver configuration look like? In particular, if I were using the Java driver, is it correct to assume that I'd set `authentication` to `SPNEGO` and leave `keytab` and `principal` blank? In that case, I am assuming the Java Kerberos library would find the cached ticket and set up the appropriate HTTP requests.
>
> Exactly right. The user does nothing special, and then the underlying Java security code provides it when the HTTP client library asks for the ticket.
>
>> Cheers,
>> Francis
>>
>> On 11/07/2017 12:49 AM, Josh Elser wrote:
>>> Hey Francis,
>>>
>>> On 7/10/17 7:09 AM, F21 wrote:
>>>> Follow up questions:
>>>>
>>>> - According to the client reference for the principal parameter [0], the Java client is able to perform a Kerberos login before contacting the Avatica server. There appears to be no way to set the KDC address in the client. How does the Java client perform Kerberos logins?
>>>
>>> This is convention for Java. There are expected locations at which a file, krb5.conf, is located on each platform. For Linux, this is /etc/krb5.conf.
>>>
>>>> - There is also an option for the user to perform the login themselves. In this case, how does the Java client pass the Kerberos ticket to the Avatica server?
>>>
>>> Again, convention. On Linux, the location of a user's ticket cache is defined to be /tmp/krb5cc_$(id -u $(whoami)). This location can be overridden by the environment variable KRB5CCNAME. All of this is handled by Java itself. This is definitely the common case for interactive users.
>>>
>>>> [0] https://calcite.apache.org/avatica/docs/client_reference.html#principal
>>>>
>>>> On 10/07/2017 3:57 PM, F21 wrote:
>>>>> Recently, I came across a maintained pure-Go Kerberos client and server [0]. I am now in the process of adding SPNEGO authentication to the Go Avatica client [1].
>>>>>
>>>>> For the implementation, the plan is to make it as close to the official (Java) client's implementation as possible.
>>>>>
>>>>> For SPNEGO, the Java client uses these 2 parameters: principal and keytab. The keytab parameter is easy to understand: a path to a keytab file. I'd like to confirm what a valid string for the principal looks like.
>>>>>
>>>>> - Is it a Service Principal Name?
>>>>> - What are the valid formats for the principal? A valid SPN looks like User1/User2@realm.
>>>>>   - For the above example, I am assuming User2 can be optional.
>>>>>   - Can the realm be optional?
>>>
>>> See http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html. This page does a very good job of concisely expressing what a Kerberos principal is and what can be implied (based on krb5.conf). Let me know if you still have questions.
>>>
>>>>> Cheers,
>>>>> Francis
>>>>>
>>>>> [0] https://github.com/jcmturner/gokrb5
>>>>> [1] https://github.com/Boostport/avatica
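The lookup conventions discussed in this thread (an explicit krb5.conf setting with a fallback to well-known locations, and a ticket cache at /tmp/krb5cc_<uid> unless KRB5CCNAME says otherwise) can be sketched as below. This is an illustration only: KerberosLocations, ticketCache, and krb5Conf are hypothetical names, not part of Avatica, the Go driver, or the JDK's Kerberos code, and only a plain path or a "FILE:" prefixed value of KRB5CCNAME is handled (other cache types are left as-is).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper; not part of Avatica or the JDK's Kerberos code. */
final class KerberosLocations {
  private KerberosLocations() {}

  /**
   * Ticket cache path: KRB5CCNAME if set (a leading "FILE:" type prefix
   * is dropped), otherwise the Linux default /tmp/krb5cc_<uid>.
   */
  static String ticketCache(Map<String, String> env, long uid) {
    String fromEnv = env.get("KRB5CCNAME");
    if (fromEnv != null && !fromEnv.isEmpty()) {
      return fromEnv.startsWith("FILE:") ? fromEnv.substring(5) : fromEnv;
    }
    return "/tmp/krb5cc_" + uid;
  }

  /**
   * krb5.conf path: the explicit setting if given, otherwise the first
   * candidate location that exists as a regular file.
   */
  static Optional<Path> krb5Conf(String configured, List<String> candidates) {
    if (configured != null && !configured.isEmpty()) {
      return Optional.of(Paths.get(configured));
    }
    return candidates.stream()
        .map(p -> Paths.get(p))
        .filter(Files::isRegularFile)
        .findFirst();
  }

  public static void main(String[] args) {
    // The uid 1000 is an arbitrary example value.
    System.out.println(ticketCache(System.getenv(), 1000L));
    System.out.println(krb5Conf(null, Arrays.asList("/etc/krb5.conf")));
  }
}
```

Passing the environment in as a map, rather than reading it inside the helper, keeps the caller in control of whether environment variables are consulted at all, which fits the concern raised above about drivers reaching into the environment.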