[jira] [Comment Edited] (CALCITE-963) Hoist literals

2019-09-18 Thread Scott Reynolds (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932829#comment-16932829
 ] 

Scott Reynolds edited comment on CALCITE-963 at 9/18/19 9:14 PM:
-

h1. Goal

When a query is issued to Calcite, it is parsed and optimized, and Calcite then 
generates, as a string, a Java class that implements {{Bindable}}. 
{{EnumerableInterpretable}} checks whether this class already exists in its 
{{com.google.common.cache}} cache and, if it does not, calls into a Java 
compiler. Compilation can take a considerable amount of time: Apache Kylin 
reported 50 to 150 ms of additional computation time. Today, Apache Calcite 
generates a unique Java class string whenever any part of the query changes, 
including a literal. This document details the design and implementation of a 
hoisting technique within Apache Calcite. This design and implementation 
greatly increases the cache hit rate of {{EnumerableInterpretable}}'s 
{{BINDABLE_CACHE}}.
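
The effect on the cache can be illustrated with a toy model. This is only a sketch under stated assumptions: {{HoistingCacheSketch}}, {{compileIfAbsent}}, {{inlineSource}} and {{hoistedSource}} are hypothetical names, a plain {{HashMap}} stands in for the real cache, and the "generated source" strings are invented for illustration. It shows why a cache keyed by generated source misses whenever a literal is embedded in that source, and hits once the literal is replaced by a slot lookup.
{code:java}
import java.util.HashMap;
import java.util.Map;

/** Toy model of a compile cache keyed by generated source text (hypothetical names). */
class HoistingCacheSketch {
  /** Stand-in for the compiled-class cache; the key is the generated source. */
  static final Map<String, Object> CACHE = new HashMap<>();

  /** Counts how often the (simulated) compiler had to run. */
  static int compilations = 0;

  /** Returns the cached "compiled" object, compiling only on a cache miss. */
  static Object compileIfAbsent(String source) {
    return CACHE.computeIfAbsent(source, s -> {
      compilations++; // stands in for the expensive Java compiler call
      return new Object();
    });
  }

  /** Without hoisting: the literal is embedded in the generated source. */
  static String inlineSource(int fetch) {
    return "return input.take(" + fetch + ");";
  }

  /** With hoisting: the source refers to a slot, so it does not depend on the literal. */
  static String hoistedSource(int slot) {
    return "return input.take((Integer) variables.getVariable(" + slot + "));";
  }

  public static void main(String[] args) {
    // LIMIT 10 and LIMIT 20 produce different source text: two compilations.
    compileIfAbsent(inlineSource(10));
    compileIfAbsent(inlineSource(20));
    System.out.println("inline compilations: " + compilations);

    compilations = 0;
    CACHE.clear();
    // Both queries use slot 0, so the source text is identical: one compilation.
    compileIfAbsent(hoistedSource(0));
    compileIfAbsent(hoistedSource(0));
    System.out.println("hoisted compilations: " + compilations);
  }
}
{code}
In this model the two inline queries trigger two compilations, while the two hoisted queries share a single cache entry.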
h1. Non Goals

This implementation is not designed to change the planning process. It does not 
transform {{RexLiteral}} into {{RexDynamicParam}}, and doesn't change the cost 
calculation of the query.
h1. Implementation Details

After a query has been optimized, three phases remain:
 # Generating the Java code
 # Binding Hoisted Variables
 # Runtime execution via {{Bindable.bind(DataContext, HoistedVariables)}}

Each of these phases interacts with a new class called {{HoistedVariables}}.

!HoistedVariables.png!

The methods of {{HoistedVariables}} are used in the three phases above to hoist 
a variable from within the query into the runtime execution of the {{Bindable}}.

The {{implement}} method of the {{EnumerableRel}} interface is used to generate 
the Java code in phase one. Each {{RelNode}} can now call 
{{registerVariable(String)}} to allocate a {{Slot}} for its unbound value. 
This {{Slot}} is reserved for that node's use and is unique within the query 
plan. When a {{RelNode}} registers a variable, it needs to save the {{Slot}} in 
a property so that it can be referenced in phase two. During code generation, 
the {{Slot}} is referenced by calling {{EnumerableRel.lookupValue}}, which 
returns an {{Expression}} that extracts the bound value for the {{Slot}}.
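
The register, bind and look up cycle can be sketched as a small slot registry. This is not the {{HoistedVariables}} class from the patch: the class name {{HoistedVariablesSketch}}, the use of a plain {{int}} as the slot, and the {{getVariable}} method are assumptions made for illustration. Only {{registerVariable}} and {{setVariable}} are names taken from the text and snippets in this comment.
{code:java}
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a slot registry; not the class from the patch. */
class HoistedVariablesSketch {
  private final List<String> names = new ArrayList<>();
  private final List<Object> values = new ArrayList<>();

  /** Phase one: reserve a slot for an unbound value and return its index. */
  int registerVariable(String name) {
    names.add(name);
    values.add(null); // reserved but not yet bound
    return names.size() - 1;
  }

  /** Phase two: bind a value taken from the query plan to a reserved slot. */
  void setVariable(int slot, Object value) {
    values.set(slot, value);
  }

  /** Phase three: read the bound value at runtime (assumed accessor name). */
  Object getVariable(int slot) {
    Object value = values.get(slot);
    if (value == null) {
      throw new IllegalStateException(
          "slot " + slot + " (" + names.get(slot) + ") is not bound");
    }
    return value;
  }

  public static void main(String[] args) {
    HoistedVariablesSketch variables = new HoistedVariablesSketch();
    int offsetSlot = variables.registerVariable("offset"); // code generation
    int fetchSlot = variables.registerVariable("fetch");
    variables.setVariable(offsetSlot, 5);                  // binding
    variables.setVariable(fetchSlot, 10);
    System.out.println("offset = " + variables.getVariable(offsetSlot)); // execution
    System.out.println("fetch = " + variables.getVariable(fetchSlot));
  }
}
{code}
Because each call to {{registerVariable}} returns a fresh index, two nodes in the same plan that both register a variable named "fetch" would still receive distinct slots in this sketch.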

Below is a snippet from {{EnumerableLimit}}'s implementation of {{implement}} 
that uses {{HoistedVariables}}.
{code:java}
Expression v = builder.append("child", result.block);
if (offset != null) {
  if (offset instanceof RexDynamicParam) {
    v = getDynamicExpression((RexDynamicParam) offset);
  } else {
    // Register with HoistedVariables here and keep the slot for phase two.
    offsetIndex = variables.registerVariable("offset");
    v = builder.append(
        "offset",
        Expressions.call(
            v,
            BuiltInMethod.SKIP.method,
            // At runtime, fetch the bound variable. This returns the Java
            // code to do that.
            EnumerableRel.lookupValue(offsetIndex, Integer.class)));
  }
}
if (fetch != null) {
  if (fetch instanceof RexDynamicParam) {
    v = getDynamicExpression((RexDynamicParam) fetch);
  } else {
    // Register with HoistedVariables here and keep the slot for phase two.
    this.fetchIndex = variables.registerVariable("fetch");
    v = builder.append(
        "fetch",
        Expressions.call(
            v,
            BuiltInMethod.TAKE.method,
            // At runtime, fetch the bound variable. This returns the Java
            // code to do that.
            EnumerableRel.lookupValue(fetchIndex, Integer.class)));
  }
}
{code}
The second phase of query execution is where the registered {{Slot}}s get 
bound. To do this, our change adds a new optional method to {{EnumerableRel}} 
called {{hoistedVariables}}. This method is where an instance of 
{{EnumerableRel}} extracts the values from the query plan and binds them into 
the {{HoistedVariables}} instance just before the query is executed. Below is 
{{EnumerableLimit}}'s implementation:
{code:java}
@Override public void hoistedVariables(HoistedVariables variables) {
  // Let every input bind its own hoisted variables first.
  getInputs()
      .stream()
      .forEach(rel -> {
        final EnumerableRel enumerable = (EnumerableRel) rel;
        enumerable.hoistedVariables(variables);
      });
  if (fetchIndex != null) {
    // fetchIndex is the registered slot for this variable. Bind fetchIndex
    // to fetch.
    variables.setVariable(fetchIndex, RexLiteral.intValue(fetch));
  }
  if (offsetIndex != null) {
    // offsetIndex is the registered slot for this variable. Bind offsetIndex
    // to offset.
    variables.setVariable(offsetIndex, RexLiteral.intValue(offset));
  }
}
{code}
To tie these three phases together, {{CalcitePrepareImpl}} needs to set up the 
variables when it creates a {{PreparedResult}}:
{code:java}
try {
  CatalogReader.THREAD_LOCAL.set(catalogReader);
  final SqlConformance conformance = context.config().conformance();
  internalParameters.put("_conformance", conformance);
  // Get the compiled Bindable instance either from cache or generate a new one.
  bindable = 



