Hi Julian, Stamatis, Seung-Hwan,

@Julian, Thank you for suggesting JMH for performance analysis. I prepared
a TPC-H benchmark test using JMH and  as you already mentioned, the time
for each phases reduced drastically. I was surprised and the same time
happy to see the benchmark results.
@Stamatis, Yes you are correct. The support for adding the custom rules in
Calcite will beat the overall performance of built-in rules-set of any
RDBMS. I can convince my guide with regards to these points .
@Seung-Hwan, Thank you so much for your interest in collaborating with us.
I'll be also very happy to contribute towards the benchmark analysis for
Calcite.

The JMH benchmark test for TPC-H and the results of its comparison with
postgres is also attached to this email. Please find them and I appreciate
your feedback in this regard.

Thanking you

Lekshmi B.G
Email: [email protected]




On Mon, Dec 31, 2018 at 3:53 PM Lim, Seung-Hwan <[email protected]>
wrote:

> Hi Lekshmi,
>
> I am one of the members in Edmon Begoli’s team who did preliminary work on
> comparison Calcite with conventional RDBMS, especially postgresql.
>
> The major challenge that we’ve identified is that many benchmarks (e.g.,
> TPC-H, TPC-DS) evaluates the performance of ‘Join’ operations. For both
> benchmarks, Calcite often generated less optimized plan than RDBMS.
>
> I’ll be very happy to help you or collaborate with you in this regard.
>
> Thank you,
> Seung-Hwan
>
>
> > On Dec 31, 2018, at 8:33 AM, Stamatis Zampetakis <[email protected]>
> wrote:
> >
> > Hi Lekshmi,
> >
> > Thanks for the interesting information. It is good to see more people
> > involved in the benchmark and optimizations on Calcite.
> >
> > However, I am not sure I understand what you are trying to achieve by
> > performing an all-in-all comparison between Calcite and other databases
> (in
> > the particular case with Postgres).
> > Calcite provides you everything you need to build a database but itself
> is
> > not a database. Could you possibly share a bit more information on what
> you
> > are expecting to gain from these kind of experiments.
> >
> > On the other hand, it would be very interesting to compare individual
> parts
> > of Calcite (e.g., optimizer) with the respective ones of Postgres (or
> other
> > database) although this will not be easy.
> > If for instance, you want to compare the optimizers in terms of
> > performance, time may not be a good metric since C code will almost
> always
> > be faster than Java code.
> > Another comparison axe for the optimizer, could be the quality of the
> > produced plans but finding a good metric can be also challenging.
> Measuring
> > the quality of the plan could be based on the execution time of the plan
> on
> > the same engine (all in Calcite or all in Postgres for instance). In
> terms
> > of research, I guess it would be very nice to demonstrate that a Volcano
> > optimizer (Calcite) with a custom rule-set can beat the built-in
> optimizer
> > of Postgres in terms of plan quality; plus it would be very useful for
> many
> > end-users of Calcite to have a rule-set that simulates the optimizer of
> > Postgres (or another database).
> >
> > As a general comment, I think it would be easier to find good use cases
> in
> > favor of Calcite if you emphasize in data integration scenarios,
> > cross-database queries,  querying raw data (not in a database) and/or
> > systems without an optimizer.
> >
> > Best,
> > Stamatis
> >
> > Στις Δευ, 31 Δεκ 2018 στις 12:35 μ.μ., ο/η Lekshmi <
> [email protected]>
> > έγραψε:
> >
> >> Hi Julian,
> >>
> >> Thanks for a lot for the prompt response and support. I will try running
> >> the test with JMH and will let you know the feedback.
> >>
> >> I wish you all have a prosperous new year.
> >>
> >> Thanks and Regards
> >>
> >> Lekshmi B.G
> >> Email: [email protected]
> >>
> >>
> >>
> >>
> >> On Mon, Dec 31, 2018 at 10:38 AM Julian Feinauer <
> >> [email protected]> wrote:
> >>
> >>> Hi Lekshmi,
> >>>
> >>> your activity sounds very interesting.
> >>> One important thing to note is that Performance testing in Java is
> always
> >>> tricky due to JIT and "warmup" phase of the JVM. Thus it is generally
> >>> recommended to do these tests with JMH (
> >>> https://openjdk.java.net/projects/code-tools/jmh/).
> >>>
> >>> I would assume that the time for sql2rel reduces drastically (perhaps
> one
> >>> or two orders) when run with JMH.
> >>>
> >>> Best
> >>> Julian
> >>>
> >>> Am 30.12.18, 23:12 schrieb "Lekshmi" <[email protected]>:
> >>>
> >>>    Hello Folks,
> >>>
> >>>    For my research activities, I was trying to perform a benchmark
> >>> comparison
> >>>    between calcite with other database systems.  As an initial step, I
> >> was
> >>>    trying to do it for *Calcite* and *PostgresSql*. So, I thought TPCH
> >>> queries
> >>>    were the right thing to start with. I tried running the TpchTest (
> >>>
> >>>
> >>
> https://github.com/apache/calcite/blob/master/plus/src/test/java/org/apache/calcite/adapter/tpch/TpchTest.java
> >>> )
> >>>    by adding the *CalciteTimingTracer* in the junit tests to determine
> >> the
> >>>    execution time. While doing so, I could see that the execution time
> >> in
> >>>    calcite is significantly higher compared to postgresSql. On further
> >>>    investigation, I could see that we generate the required datas
> >>> required for
> >>>    these queries(which comes around 150,000 for some tables) and I was
> >>> under
> >>>    an impression that most of the time was spend on the data generation
> >>> and
> >>>    that the query execution could be faster. So, I modified the
> relevant
> >>>    schema class (
> >>>
> >>>
> >>
> https://github.com/apache/calcite/blob/master/plus/src/main/java/org/apache/calcite/adapter/tpch/TpchSchema.java
> >>> )
> >>>    to perform the data generation and query execution separately. Then,
> >> I
> >>>    traced the time took for just query execution. Even, then there was
> a
> >>>    significant difference from that of PostgresSql.
> >>>
> >>>    I, also enabled the *log4j.rootLogger* to *TRACE * to find the time
> >>> spend
> >>>    for sql2rel and optimization phases of the class Prepare
> >>>    <
> >>>
> >>>
> >>
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/prepare/Prepare.java
> >>>> .
> >>>    And, to my surprise, I could see that calcite takes a time of 355ms
> >> for
> >>>    sql2rel and 352ms for optimization for the junit test *testQuery01*.
> >>> On the
> >>>    other side, the same query gave a planning time of 0.163ms in
> >> Postgres.
> >>>
> >>>    I would like to know, if this is the right way to test the
> >> performance
> >>> of
> >>>    TPCH queries using apache calcite. Can anyone let me know if there
> >>> exist
> >>>    any better ways to do it.
> >>>
> >>>    And, while searching through JIRA, I could find a ticket
> >>>    https://issues.apache.org/jira/browse/CALCITE-2169 which was
> created
> >>> by
> >>>    Edmon Begoli for performing a comparative performance study of the
> >>> calcite
> >>>    framework. I think, its related to my current problem. I have no
> idea
> >>>    regarding the status of the ticket. It would be really great if
> >> someone
> >>>    could help me with some information on it.
> >>>
> >>>    Also, now coming to the personal preference, I would like to
> continue
> >>> my
> >>>    research in calcite due to its simplicity and extensibility.  But,
> >> if I
> >>>    fail to give a good case study in favour of Calcite, I am afraid
> >> that I
> >>>    could loose an opportunity to work with calcite.
> >>>
> >>>    Thanks and Regards
> >>>
> >>>    Lekshmi B.G
> >>>    Email: [email protected]
> >>>
> >>>
> >>>
> >>
>
>
/**
 * 
 */
package org.fau.tpch.benchmark;

import java.util.List;
import java.util.concurrent.TimeUnit;

import org.apache.calcite.adapter.tpch.TpchSchema;
import org.apache.calcite.test.CalciteAssert;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.annotations.Warmup;

import com.google.common.collect.ImmutableList;

/**
 * @author lekshmi
 *
 */
@Fork(value = 5, jvmArgsPrepend = "-Xmx2048m")
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.MILLISECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.MILLISECONDS)
@State(Scope.Thread)
@Threads(1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class TpchBenchmarkTest
{
	private static String schema(String name, String scaleFactor)
	{
		return "     {\n"
				+ "       type: 'custom',\n"
				+ "       name: '" + name + "',\n"
				+ "       factory: 'org.apache.calcite.adapter.tpch.TpchSchemaFactory',\n"
				+ "       operand: {\n"
				+ "         columnPrefix: true,\n"
				+ "         scale: " + scaleFactor + "\n"
				+ "       }\n"
				+ "     }";
	}

	public static final String TPCH_MODEL = "{\n"
			+ "  version: '1.0',\n"
			+ "  defaultSchema: 'TPCH',\n"
			+ "   schemas: [\n"
			+ schema("TPCH", "1.0") + ",\n"
			+ schema("TPCH_01", "0.01") + ",\n"
			+ schema("TPCH_5", "5.0") + "\n"
			+ "   ]\n"
			+ "}";

	static final List<String> QUERIES = ImmutableList.of(
			// 01
			"select\n"
					+ "  l_returnflag,\n"
					+ "  l_linestatus,\n"
					+ "  sum(l_quantity) as sum_qty,\n"
					+ "  sum(l_extendedprice) as sum_base_price,\n"
					+ "  sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,\n"
					+ "  sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,\n"
					+ "  avg(l_quantity) as avg_qty,\n"
					+ "  avg(l_extendedprice) as avg_price,\n"
					+ "  avg(l_discount) as avg_disc,\n"
					+ "  count(*) as count_order\n"
					+ "from\n"
					+ "  tpch.lineitem\n"
					+ "-- where\n"
					// + "-- l_shipdate <= date '1998-12-01' - interval '120'
					// day (3)\n"
					+ "--  l_shipdate <= date '1998-12-01' - interval '117' day \n"
					+ "group by\n"
					+ "  l_returnflag,\n"
					+ "  l_linestatus\n"
					+ "\n"
					+ "order by\n"
					+ "  l_returnflag,\n"
					+ "  l_linestatus\n"
					+ "limit 1",

			// 02
			"select\n"
					+ "  s.s_acctbal,\n"
					+ "  s.s_name,\n"
					+ "  n.n_name,\n"
					+ "  p.p_partkey,\n"
					+ "  p.p_mfgr,\n"
					+ "  s.s_address,\n"
					+ "  s.s_phone,\n"
					+ "  s.s_comment\n"
					+ "from\n"
					+ "  tpch.part p,\n"
					+ "  tpch.supplier s,\n"
					+ "  tpch.partsupp ps,\n"
					+ "  tpch.nation n,\n"
					+ "  tpch.region r\n"
					+ "where\n"
					+ "  p.p_partkey = ps.ps_partkey\n"
					+ "  and s.s_suppkey = ps.ps_suppkey\n"
					+ "  and p.p_size = 41\n"
					+ "  and p.p_type like '%NICKEL'\n"
					+ "  and s.s_nationkey = n.n_nationkey\n"
					+ "  and n.n_regionkey = r.r_regionkey\n"
					+ "  and r.r_name = 'EUROPE'\n"
					+ "  and ps.ps_supplycost = (\n"
					+ "\n"
					+ "    select\n"
					+ "      min(ps.ps_supplycost)\n"
					+ "\n"
					+ "    from\n"
					+ "      tpch.partsupp ps,\n"
					+ "      tpch.supplier s,\n"
					+ "      tpch.nation n,\n"
					+ "      tpch.region r\n"
					+ "    where\n"
					+ "      p.p_partkey = ps.ps_partkey\n"
					+ "      and s.s_suppkey = ps.ps_suppkey\n"
					+ "      and s.s_nationkey = n.n_nationkey\n"
					+ "      and n.n_regionkey = r.r_regionkey\n"
					+ "      and r.r_name = 'EUROPE'\n"
					+ "  )\n"
					+ "\n"
					+ "order by\n"
					+ "  s.s_acctbal desc,\n"
					+ "  n.n_name,\n"
					+ "  s.s_name,\n"
					+ "  p.p_partkey\n"
					+ "limit 100",

			// 03
			"select\n"
					+ "  l.l_orderkey,\n"
					+ "  sum(l.l_extendedprice * (1 - l.l_discount)) as revenue,\n"
					+ "  o.o_orderdate,\n"
					+ "  o.o_shippriority\n"
					+ "\n"
					+ "from\n"
					+ "  tpch.customer c,\n"
					+ "  tpch.orders o,\n"
					+ "  tpch.lineitem l\n"
					+ "\n"
					+ "where\n"
					+ "  c.c_mktsegment = 'HOUSEHOLD'\n"
					+ "  and c.c_custkey = o.o_custkey\n"
					+ "  and l.l_orderkey = o.o_orderkey\n"
					+ "--  and o.o_orderdate < date '1995-03-25'\n"
					+ "--  and l.l_shipdate > date '1995-03-25'\n"
					+ "\n"
					+ "group by\n"
					+ "  l.l_orderkey,\n"
					+ "  o.o_orderdate,\n"
					+ "  o.o_shippriority\n"
					+ "order by\n"
					+ "  revenue desc,\n"
					+ "  o.o_orderdate\n"
					+ "limit 10",

			// 04
			"select\n"
					+ "  o_orderpriority,\n"
					+ "  count(*) as order_count\n"
					+ "from\n"
					+ "  tpch.orders\n"
					+ "\n"
					+ "where\n"
					+ "--  o_orderdate >= date '1996-10-01'\n"
					+ "--  and o_orderdate < date '1996-10-01' + interval '3' month\n"
					+ "--  and \n"
					+ "  exists (\n"
					+ "    select\n"
					+ "      *\n"
					+ "    from\n"
					+ "      tpch.lineitem\n"
					+ "    where\n"
					+ "      l_orderkey = o_orderkey\n"
					+ "      and l_commitdate < l_receiptdate\n"
					+ "  )\n"
					+ "group by\n"
					+ "  o_orderpriority\n"
					+ "order by\n"
					+ "  o_orderpriority",

			// 05
			"select\n"
					+ "  n.n_name,\n"
					+ "  sum(l.l_extendedprice * (1 - l.l_discount)) as revenue\n"
					+ "\n"
					+ "from\n"
					+ "  tpch.customer c,\n"
					+ "  tpch.orders o,\n"
					+ "  tpch.lineitem l,\n"
					+ "  tpch.supplier s,\n"
					+ "  tpch.nation n,\n"
					+ "  tpch.region r\n"
					+ "\n"
					+ "where\n"
					+ "  c.c_custkey = o.o_custkey\n"
					+ "  and l.l_orderkey = o.o_orderkey\n"
					+ "  and l.l_suppkey = s.s_suppkey\n"
					+ "  and c.c_nationkey = s.s_nationkey\n"
					+ "  and s.s_nationkey = n.n_nationkey\n"
					+ "  and n.n_regionkey = r.r_regionkey\n"
					+ "  and r.r_name = 'EUROPE'\n"
					+ "--  and o.o_orderdate >= date '1997-01-01'\n"
					+ "--  and o.o_orderdate < date '1997-01-01' + interval '1' year\n"
					+ "group by\n"
					+ "  n.n_name\n"
					+ "\n"
					+ "order by\n"
					+ "  revenue desc",

			// 06
			"select\n"
					+ "  sum(l_extendedprice * l_discount) as revenue\n"
					+ "from\n"
					+ "  tpch.lineitem\n"
					+ "where\n"
					+ "--  l_shipdate >= date '1997-01-01'\n"
					+ "--  and l_shipdate < date '1997-01-01' + interval '1' year\n"
					+ "--  and\n"
					+ "  l_discount between 0.03 - 0.01 and 0.03 + 0.01\n"
					+ "  and l_quantity < 24",

			// 07
			"select\n"
					+ "  supp_nation,\n"
					+ "  cust_nation,\n"
					+ "  l_year,\n"
					+ "  sum(volume) as revenue\n"
					+ "from\n"
					+ "  (\n"
					+ "    select\n"
					+ "      n1.n_name as supp_nation,\n"
					+ "      n2.n_name as cust_nation,\n"
					+ "      extract(year from l.l_shipdate) as l_year,\n"
					+ "      l.l_extendedprice * (1 - l.l_discount) as volume\n"
					+ "    from\n"
					+ "      tpch.supplier s,\n"
					+ "      tpch.lineitem l,\n"
					+ "      tpch.orders o,\n"
					+ "      tpch.customer c,\n"
					+ "      tpch.nation n1,\n"
					+ "      tpch.nation n2\n"
					+ "    where\n"
					+ "      s.s_suppkey = l.l_suppkey\n"
					+ "      and o.o_orderkey = l.l_orderkey\n"
					+ "      and c.c_custkey = o.o_custkey\n"
					+ "      and s.s_nationkey = n1.n_nationkey\n"
					+ "      and c.c_nationkey = n2.n_nationkey\n"
					+ "      and (\n"
					+ "        (n1.n_name = 'EGYPT' and n2.n_name = 'UNITED STATES')\n"
					+ "        or (n1.n_name = 'UNITED STATES' and n2.n_name = 'EGYPT')\n"
					+ "      )\n"
					+ "--      and l.l_shipdate between date '1995-01-01' and date '1996-12-31'\n"
					+ "  ) as shipping\n"
					+ "group by\n"
					+ "  supp_nation,\n"
					+ "  cust_nation,\n"
					+ "  l_year\n"
					+ "order by\n"
					+ "  supp_nation,\n"
					+ "  cust_nation,\n"
					+ "  l_year",

			// 08
			"select\n"
					+ "  o_year,\n"
					+ "  sum(case\n"
					+ "    when nation = 'EGYPT' then volume\n"
					+ "    else 0\n"
					+ "  end) / sum(volume) as mkt_share\n"
					+ "from\n"
					+ "  (\n"
					+ "    select\n"
					+ "      extract(year from o.o_orderdate) as o_year,\n"
					+ "      l.l_extendedprice * (1 - l.l_discount) as volume,\n"
					+ "      n2.n_name as nation\n"
					+ "    from\n"
					+ "      tpch.part p,\n"
					+ "      tpch.supplier s,\n"
					+ "      tpch.lineitem l,\n"
					+ "      tpch.orders o,\n"
					+ "      tpch.customer c,\n"
					+ "      tpch.nation n1,\n"
					+ "      tpch.nation n2,\n"
					+ "      tpch.region r\n"
					+ "    where\n"
					+ "      p.p_partkey = l.l_partkey\n"
					+ "      and s.s_suppkey = l.l_suppkey\n"
					+ "      and l.l_orderkey = o.o_orderkey\n"
					+ "      and o.o_custkey = c.c_custkey\n"
					+ "      and c.c_nationkey = n1.n_nationkey\n"
					+ "      and n1.n_regionkey = r.r_regionkey\n"
					+ "      and r.r_name = 'MIDDLE EAST'\n"
					+ "      and s.s_nationkey = n2.n_nationkey\n"
					+ "      and o.o_orderdate between date '1995-01-01' and date '1996-12-31'\n"
					+ "      and p.p_type = 'PROMO BRUSHED COPPER'\n"
					+ "  ) as all_nations\n"
					+ "group by\n"
					+ "  o_year\n"
					+ "order by\n"
					+ "  o_year",

			// 09
			"select\n"
					+ "  nation,\n"
					+ "  o_year,\n"
					+ "  sum(amount) as sum_profit\n"
					+ "from\n"
					+ "  (\n"
					+ "    select\n"
					+ "      n_name as nation,\n"
					+ "      extract(year from o_orderdate) as o_year,\n"
					+ "      l.l_extendedprice * (1 - l.l_discount) - ps.ps_supplycost * l.l_quantity as amount\n"
					+ "    from\n"
					+ "      tpch.part p,\n"
					+ "      tpch.supplier s,\n"
					+ "      tpch.lineitem l,\n"
					+ "      tpch.partsupp ps,\n"
					+ "      tpch.orders o,\n"
					+ "      tpch.nation n\n"
					+ "    where\n"
					+ "      s.s_suppkey = l.l_suppkey\n"
					+ "      and ps.ps_suppkey = l.l_suppkey\n"
					+ "      and ps.ps_partkey = l.l_partkey\n"
					+ "      and p.p_partkey = l.l_partkey\n"
					+ "      and o.o_orderkey = l.l_orderkey\n"
					+ "      and s.s_nationkey = n.n_nationkey\n"
					+ "      and p.p_name like '%yellow%'\n"
					+ "  ) as profit\n"
					+ "group by\n"
					+ "  nation,\n"
					+ "  o_year\n"
					+ "order by\n"
					+ "  nation,\n"
					+ "  o_year desc",

			// 10
			"select\n"
					+ "  c.c_custkey,\n"
					+ "  c.c_name,\n"
					+ "  sum(l.l_extendedprice * (1 - l.l_discount)) as revenue,\n"
					+ "  c.c_acctbal,\n"
					+ "  n.n_name,\n"
					+ "  c.c_address,\n"
					+ "  c.c_phone,\n"
					+ "  c.c_comment\n"
					+ "from\n"
					+ "  tpch.customer c,\n"
					+ "  tpch.orders o,\n"
					+ "  tpch.lineitem l,\n"
					+ "  tpch.nation n\n"
					+ "where\n"
					+ "  c.c_custkey = o.o_custkey\n"
					+ "  and l.l_orderkey = o.o_orderkey\n"
					+ "  and o.o_orderdate >= date '1994-03-01'\n"
					+ "  and o.o_orderdate < date '1994-03-01' + interval '3' month\n"
					+ "  and l.l_returnflag = 'R'\n"
					+ "  and c.c_nationkey = n.n_nationkey\n"
					+ "group by\n"
					+ "  c.c_custkey,\n"
					+ "  c.c_name,\n"
					+ "  c.c_acctbal,\n"
					+ "  c.c_phone,\n"
					+ "  n.n_name,\n"
					+ "  c.c_address,\n"
					+ "  c.c_comment\n"
					+ "order by\n"
					+ "  revenue desc\n"
					+ "limit 20",

			// 11
			"select\n"
					+ "  ps.ps_partkey,\n"
					+ "  sum(ps.ps_supplycost * ps.ps_availqty) as \"value\"\n"
					+ "from\n"
					+ "  tpch.partsupp ps,\n"
					+ "  tpch.supplier s,\n"
					+ "  tpch.nation n\n"
					+ "where\n"
					+ "  ps.ps_suppkey = s.s_suppkey\n"
					+ "  and s.s_nationkey = n.n_nationkey\n"
					+ "  and n.n_name = 'JAPAN'\n"
					+ "group by\n"
					+ "  ps.ps_partkey having\n"
					+ "    sum(ps.ps_supplycost * ps.ps_availqty) > (\n"
					+ "      select\n"
					+ "        sum(ps.ps_supplycost * ps.ps_availqty) * 0.0001000000\n"
					+ "      from\n"
					+ "        tpch.partsupp ps,\n"
					+ "        tpch.supplier s,\n"
					+ "        tpch.nation n\n"
					+ "      where\n"
					+ "        ps.ps_suppkey = s.s_suppkey\n"
					+ "        and s.s_nationkey = n.n_nationkey\n"
					+ "        and n.n_name = 'JAPAN'\n"
					+ "    )\n"
					+ "order by\n"
					+ "  \"value\" desc",

			// 12
			"select\n"
					+ "  l.l_shipmode,\n"
					+ "  sum(case\n"
					+ "    when o.o_orderpriority = '1-URGENT'\n"
					+ "      or o.o_orderpriority = '2-HIGH'\n"
					+ "      then 1\n"
					+ "    else 0\n"
					+ "  end) as high_line_count,\n"
					+ "  sum(case\n"
					+ "    when o.o_orderpriority <> '1-URGENT'\n"
					+ "      and o.o_orderpriority <> '2-HIGH'\n"
					+ "      then 1\n"
					+ "    else 0\n"
					+ "  end) as low_line_count\n"
					+ "from\n"
					+ "  tpch.orders o,\n"
					+ "  tpch.lineitem l\n"
					+ "where\n"
					+ "  o.o_orderkey = l.l_orderkey\n"
					+ "  and l.l_shipmode in ('TRUCK', 'REG AIR')\n"
					+ "  and l.l_commitdate < l.l_receiptdate\n"
					+ "  and l.l_shipdate < l.l_commitdate\n"
					+ "--  and l.l_receiptdate >= date '1994-01-01'\n"
					+ "--  and l.l_receiptdate < date '1994-01-01' + interval '1' year\n"
					+ "group by\n"
					+ "  l.l_shipmode\n"
					+ "order by\n"
					+ "  l.l_shipmode",

			// 13
			"select\n"
					+ "  c_count,\n"
					+ "  count(*) as custdist\n"
					+ "from\n"
					+ "  (\n"
					+ "    select\n"
					+ "      c.c_custkey,\n"
					+ "      count(o.o_orderkey)\n"
					+ "    from\n"
					+ "      tpch.customer c \n"
					+ "      left outer join tpch.orders o \n"
					+ "        on c.c_custkey = o.o_custkey\n"
					+ "        and o.o_comment not like '%special%requests%'\n"
					+ "    group by\n"
					+ "      c.c_custkey\n"
					+ "  ) as orders (c_custkey, c_count)\n"
					+ "group by\n"
					+ "  c_count\n"
					+ "order by\n"
					+ "  custdist desc,\n"
					+ "  c_count desc",

			// 14
			"select\n"
					+ "  100.00 * sum(case\n"
					+ "    when p.p_type like 'PROMO%'\n"
					+ "      then l.l_extendedprice * (1 - l.l_discount)\n"
					+ "    else 0\n"
					+ "  end) / sum(l.l_extendedprice * (1 - l.l_discount)) as promo_revenue\n"
					+ "from\n"
					+ "  tpch.lineitem l,\n"
					+ "  tpch.part p\n"
					+ "where\n"
					+ "  l.l_partkey = p.p_partkey\n"
					+ "  and l.l_shipdate >= date '1994-08-01'\n"
					+ "  and l.l_shipdate < date '1994-08-01' + interval '1' month",

			// 15
			"with revenue0 (supplier_no, total_revenue) as (\n"
					+ "  select\n"
					+ "    l_suppkey,\n"
					+ "    sum(l_extendedprice * (1 - l_discount))\n"
					+ "  from\n"
					+ "    tpch.lineitem\n"
					+ "  where\n"
					+ "    l_shipdate >= date '1993-05-01'\n"
					+ "    and l_shipdate < date '1993-05-01' + interval '3' month\n"
					+ "  group by\n"
					+ "    l_suppkey)\n"
					+ "select\n"
					+ "  s.s_suppkey,\n"
					+ "  s.s_name,\n"
					+ "  s.s_address,\n"
					+ "  s.s_phone,\n"
					+ "  r.total_revenue\n"
					+ "from\n"
					+ "  tpch.supplier s,\n"
					+ "  revenue0 r\n"
					+ "where\n"
					+ "  s.s_suppkey = r.supplier_no\n"
					+ "  and r.total_revenue = (\n"
					+ "    select\n"
					+ "      max(total_revenue)\n"
					+ "    from\n"
					+ "      revenue0\n"
					+ "  )\n"
					+ "order by\n"
					+ "  s.s_suppkey",

			// 16
			"select\n"
					+ "  p.p_brand,\n"
					+ "  p.p_type,\n"
					+ "  p.p_size,\n"
					+ "  count(distinct ps.ps_suppkey) as supplier_cnt\n"
					+ "from\n"
					+ "  tpch.partsupp ps,\n"
					+ "  tpch.part p\n"
					+ "where\n"
					+ "  p.p_partkey = ps.ps_partkey\n"
					+ "  and p.p_brand <> 'Brand#21'\n"
					+ "  and p.p_type not like 'MEDIUM PLATED%'\n"
					+ "  and p.p_size in (38, 2, 8, 31, 44, 5, 14, 24)\n"
					+ "  and ps.ps_suppkey not in (\n"
					+ "    select\n"
					+ "      s_suppkey\n"
					+ "    from\n"
					+ "      tpch.supplier\n"
					+ "    where\n"
					+ "      s_comment like '%Customer%Complaints%'\n"
					+ "  )\n"
					+ "group by\n"
					+ "  p.p_brand,\n"
					+ "  p.p_type,\n"
					+ "  p.p_size\n"
					+ "order by\n"
					+ "  supplier_cnt desc,\n"
					+ "  p.p_brand,\n"
					+ "  p.p_type,\n"
					+ "  p.p_size",

			// 17
			"select\n"
					+ "  sum(l.l_extendedprice) / 7.0 as avg_yearly\n"
					+ "from\n"
					+ "  tpch.lineitem l,\n"
					+ "  tpch.part p\n"
					+ "where\n"
					+ "  p.p_partkey = l.l_partkey\n"
					+ "  and p.p_brand = 'Brand#13'\n"
					+ "  and p.p_container = 'JUMBO CAN'\n"
					+ "  and l.l_quantity < (\n"
					+ "    select\n"
					+ "      0.2 * avg(l2.l_quantity)\n"
					+ "    from\n"
					+ "      tpch.lineitem l2\n"
					+ "    where\n"
					+ "      l2.l_partkey = p.p_partkey\n"
					+ "  )",

			// 18
			"select\n"
					+ "  c.c_name,\n"
					+ "  c.c_custkey,\n"
					+ "  o.o_orderkey,\n"
					+ "  o.o_orderdate,\n"
					+ "  o.o_totalprice,\n"
					+ "  sum(l.l_quantity)\n"
					+ "from\n"
					+ "  tpch.customer c,\n"
					+ "  tpch.orders o,\n"
					+ "  tpch.lineitem l\n"
					+ "where\n"
					+ "  o.o_orderkey in (\n"
					+ "    select\n"
					+ "      l_orderkey\n"
					+ "    from\n"
					+ "      tpch.lineitem\n"
					+ "    group by\n"
					+ "      l_orderkey having\n"
					+ "        sum(l_quantity) > 313\n"
					+ "  )\n"
					+ "  and c.c_custkey = o.o_custkey\n"
					+ "  and o.o_orderkey = l.l_orderkey\n"
					+ "group by\n"
					+ "  c.c_name,\n"
					+ "  c.c_custkey,\n"
					+ "  o.o_orderkey,\n"
					+ "  o.o_orderdate,\n"
					+ "  o.o_totalprice\n"
					+ "order by\n"
					+ "  o.o_totalprice desc,\n"
					+ "  o.o_orderdate\n"
					+ "limit 100",

			// 19
			"select\n"
					+ "  sum(l.l_extendedprice* (1 - l.l_discount)) as revenue\n"
					+ "from\n"
					+ "  tpch.lineitem l,\n"
					+ "  tpch.part p\n"
					+ "where\n"
					+ "  (\n"
					+ "    p.p_partkey = l.l_partkey\n"
					+ "    and p.p_brand = 'Brand#41'\n"
					+ "    and p.p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')\n"
					+ "    and l.l_quantity >= 2 and l.l_quantity <= 2 + 10\n"
					+ "    and p.p_size between 1 and 5\n"
					+ "    and l.l_shipmode in ('AIR', 'AIR REG')\n"
					+ "    and l.l_shipinstruct = 'DELIVER IN PERSON'\n"
					+ "  )\n"
					+ "  or\n"
					+ "  (\n"
					+ "    p.p_partkey = l.l_partkey\n"
					+ "    and p.p_brand = 'Brand#13'\n"
					+ "    and p.p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')\n"
					+ "    and l.l_quantity >= 14 and l.l_quantity <= 14 + 10\n"
					+ "    and p.p_size between 1 and 10\n"
					+ "    and l.l_shipmode in ('AIR', 'AIR REG')\n"
					+ "    and l.l_shipinstruct = 'DELIVER IN PERSON'\n"
					+ "  )\n"
					+ "  or\n"
					+ "  (\n"
					+ "    p.p_partkey = l.l_partkey\n"
					+ "    and p.p_brand = 'Brand#55'\n"
					+ "    and p.p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')\n"
					+ "    and l.l_quantity >= 23 and l.l_quantity <= 23 + 10\n"
					+ "    and p.p_size between 1 and 15\n"
					+ "    and l.l_shipmode in ('AIR', 'AIR REG')\n"
					+ "    and l.l_shipinstruct = 'DELIVER IN PERSON'\n"
					+ "  )",

			// 20
			"select\n"
					+ "  s.s_name,\n"
					+ "  s.s_address\n"
					+ "from\n"
					+ "  tpch.supplier s,\n"
					+ "  tpch.nation n\n"
					+ "where\n"
					+ "  s.s_suppkey in (\n"
					+ "    select\n"
					+ "      ps.ps_suppkey\n"
					+ "    from\n"
					+ "      tpch.partsupp ps\n"
					+ "    where\n"
					+ "      ps. ps_partkey in (\n"
					+ "        select\n"
					+ "          p.p_partkey\n"
					+ "        from\n"
					+ "          tpch.part p\n"
					+ "        where\n"
					+ "          p.p_name like 'antique%'\n"
					+ "      )\n"
					+ "      and ps.ps_availqty > (\n"
					+ "        select\n"
					+ "          0.5 * sum(l.l_quantity)\n"
					+ "        from\n"
					+ "          tpch.lineitem l\n"
					+ "        where\n"
					+ "          l.l_partkey = ps.ps_partkey\n"
					+ "          and l.l_suppkey = ps.ps_suppkey\n"
					+ "          and l.l_shipdate >= date '1993-01-01'\n"
					+ "          and l.l_shipdate < date '1993-01-01' + interval '1' year\n"
					+ "      )\n"
					+ "  )\n"
					+ "  and s.s_nationkey = n.n_nationkey\n"
					+ "  and n.n_name = 'KENYA'\n"
					+ "order by\n"
					+ "  s.s_name",

			// 21
			"select\n"
					+ "  s.s_name,\n"
					+ "  count(*) as numwait\n"
					+ "from\n"
					+ "  tpch.supplier s,\n"
					+ "  tpch.lineitem l1,\n"
					+ "  tpch.orders o,\n"
					+ "  tpch.nation n\n"
					+ "where\n"
					+ "  s.s_suppkey = l1.l_suppkey\n"
					+ "  and o.o_orderkey = l1.l_orderkey\n"
					+ "  and o.o_orderstatus = 'F'\n"
					+ "  and l1.l_receiptdate > l1.l_commitdate\n"
					+ "  and exists (\n"
					+ "    select\n"
					+ "      *\n"
					+ "    from\n"
					+ "      tpch.lineitem l2\n"
					+ "    where\n"
					+ "      l2.l_orderkey = l1.l_orderkey\n"
					+ "      and l2.l_suppkey <> l1.l_suppkey\n"
					+ "  )\n"
					+ "  and not exists (\n"
					+ "    select\n"
					+ "      *\n"
					+ "    from\n"
					+ "      tpch.lineitem l3\n"
					+ "    where\n"
					+ "      l3.l_orderkey = l1.l_orderkey\n"
					+ "      and l3.l_suppkey <> l1.l_suppkey\n"
					+ "      and l3.l_receiptdate > l3.l_commitdate\n"
					+ "  )\n"
					+ "  and s.s_nationkey = n.n_nationkey\n"
					+ "  and n.n_name = 'BRAZIL'\n"
					+ "group by\n"
					+ "  s.s_name\n"
					+ "order by\n"
					+ "  numwait desc,\n"
					+ "  s.s_name\n"
					+ "limit 100",

			// 22
			"select\n"
					+ "  cntrycode,\n"
					+ "  count(*) as numcust,\n"
					+ "  sum(c_acctbal) as totacctbal\n"
					+ "from\n"
					+ "  (\n"
					+ "    select\n"
					+ "      substring(c_phone from 1 for 2) as cntrycode,\n"
					+ "      c_acctbal\n"
					+ "    from\n"
					+ "      tpch.customer c\n"
					+ "    where\n"
					+ "      substring(c_phone from 1 for 2) in\n"
					+ "        ('24', '31', '11', '16', '21', '20', '34')\n"
					+ "      and c_acctbal > (\n"
					+ "        select\n"
					+ "          avg(c_acctbal)\n"
					+ "        from\n"
					+ "          tpch.customer\n"
					+ "        where\n"
					+ "          c_acctbal > 0.00\n"
					+ "          and substring(c_phone from 1 for 2) in\n"
					+ "            ('24', '31', '11', '16', '21', '20', '34')\n"
					+ "      )\n"
					+ "      and not exists (\n"
					+ "        select\n"
					+ "          *\n"
					+ "        from\n"
					+ "          tpch.orders o\n"
					+ "        where\n"
					+ "          o.o_custkey = c.c_custkey\n"
					+ "      )\n"
					+ "  ) as custsale\n"
					+ "group by\n"
					+ "  cntrycode\n"
					+ "order by\n"
					+ "  cntrycode");

	private CalciteAssert.AssertThat with()
	{
		return CalciteAssert.model(TPCH_MODEL)
				.enable(true);
	}

	private void runQuery(int i)
	{
		with().query(QUERIES.get(i - 1).replaceAll("tpch\\.", "tpch_01."))
				.runs();
	}

	@Setup
	public void setUp()
	{
		// Data generation
		TpchSchema.generateTpchData(1D, 1, 1, true);

	}

	@Benchmark
	public void benchmarkQuery01()
	{
		runQuery(1);
	}

	@Benchmark
	public void benchmarkQuery02()
	{
		runQuery(2);
	}

	@Benchmark
	public void benchmarkQuery03()
	{
		runQuery(3);
	}

	@Benchmark
	public void benchmarkQuery04()
	{
		runQuery(4);
	}

	@Benchmark
	public void benchmarkQuery05()
	{
		runQuery(5);
	}

	@Benchmark
	public void benchmarkQuery06()
	{
		runQuery(6);
	}

	@Benchmark
	public void benchmarkQuery07()
	{
		runQuery(7);
	}

	@Benchmark
	public void benchmarkQuery08()
	{
		runQuery(8);
	}

	@Benchmark
	public void benchmarkQuery09()
	{
		runQuery(9);
	}

	@Benchmark
	public void benchmarkQuery10()
	{
		runQuery(10);
	}

	@Benchmark
	public void benchmarkQuery11()
	{
		runQuery(11);
	}

	@Benchmark
	public void benchmarkQuery12()
	{
		runQuery(12);
	}

	@Benchmark
	public void benchmarkQuery13()
	{
		runQuery(13);
	}

	@Benchmark
	public void benchmarkQuery14()
	{
		runQuery(14);
	}

	@Benchmark
	public void benchmarkQuery15()
	{
		runQuery(15);
	}

	@Benchmark
	public void benchmarkQuery16()
	{
		runQuery(16);
	}

	@Benchmark
	public void benchmarkQuery17()
	{
		runQuery(17);
	}

	@Benchmark
	public void benchmarkQuery18()
	{
		runQuery(18);
	}

	@Benchmark
	public void benchmarkQuery19()
	{
		runQuery(19);
	}

	@Benchmark
	public void benchmarkQuery20()
	{
		runQuery(20);
	}

	@Benchmark
	public void benchmarkQuery21()
	{
		runQuery(21);
	}

	@Benchmark
	public void benchmarkQuery22()
	{
		runQuery(22);
	}
}

Reply via email to