[
https://issues.apache.org/jira/browse/CALCITE-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17069217#comment-17069217
]
neoremind edited comment on CALCITE-3878 at 3/28/20, 8:00 AM:
--------------------------------------------------------------
I didn't expect to bring so deep discussion on this. Sorry to bring so much
trouble to you guys, we should more focus on Calcite itself.
[~julianhyde] , I agree that for high level language, code clarity matters a
lot. The example you mentioned:
{code:java}
@Override public RelNode convert(RelNode rel) {
LogicalJoin join = (LogicalJoin) rel;
List<RelNode> newInputs = new ArrayList<>(join.getInputs().size());
for (RelNode input : join.getInputs()) {
{code}
Indeed, the wasted space is not a big deal.Like [~vladimirsitnikov] pointed
out, in cases like with so many selected expressions, might be a good reason.
So the debate make me think what shall we do in the future as a principle?
Allow new ArrayList with capacity or not?
I search on existing code, there are already mixed usages. For example,
1) Specify constant capacity.
{code:java}
JdbcTable:
public SqlIdentifier tableName() {
final List<String> names = new ArrayList<>(3);
if (jdbcSchema.catalog != null) {
names.add(jdbcSchema.catalog);
}
if (jdbcSchema.schema != null) {
names.add(jdbcSchema.schema);
}
names.add(jdbcTableName);
return new SqlIdentifier(names, SqlParserPos.ZERO);
}
{code}
2) Specify 0 capacity.
{code:java}
RexProgram:
for (RelCollation collation : inputCollations) {
final List<RelFieldCollation> fieldCollations = new ArrayList<>(0);
for (RelFieldCollation fieldCollation : collation.getFieldCollations()) {
final int source = fieldCollation.getFieldIndex();
final int target = targets[source];
if (target < 0) {
continue loop;
}
fieldCollations.add(fieldCollation.withFieldIndex(target));
}
{code}
3) Specify capacity with exact size.
{code:java}
HepPlanner.java:
final List<RelNode> children = new ArrayList<>(childRels.size());
for (HepRelVertex childRel : childRels) {
children.add(childRel.getCurrentRel());
}
{code}
If we leave it as what happens now, not care too much, then I hope this issue
could be help people think of performance as well as clean code in the same
time. (Just row I notice people use
_description.matches("[A-Za-z][-A-Za-z0-9_.(),\\[\\]
s:]*")_ in rule constructor, which should be make the +Patter.complie+ out :)
sorry for nitpicking).
was (Author: neoremind):
I didn't expect to bring so deep discussion on this. Sorry to bring so much
trouble to you guys, we should more focus on Calcite itself.
[~julianhyde] , I agree that for high level language, code clarity matters a
lot. The example you mentioned:
{code:java}
@Override public RelNode convert(RelNode rel) {
LogicalJoin join = (LogicalJoin) rel;
List<RelNode> newInputs = new ArrayList<>(join.getInputs().size());
for (RelNode input : join.getInputs()) {
{code}
Indeed, the wasted space is not a big deal.Like [~vladimirsitnikov] pointed
out, in cases like with so many selected expressions, might be a good reason.
So the debate make me think what shall we do in the future as a principle?
Allow new ArrayList with capacity or not?
I search on existing code, there are already mixed usages. For example,
1) Specify constant capacity.
{code:java}
JdbcTable:
public SqlIdentifier tableName() {
final List<String> names = new ArrayList<>(3);
if (jdbcSchema.catalog != null) {
names.add(jdbcSchema.catalog);
}
if (jdbcSchema.schema != null) {
names.add(jdbcSchema.schema);
}
names.add(jdbcTableName);
return new SqlIdentifier(names, SqlParserPos.ZERO);
}
{code}
2) Specify 0 capacity.
{code:java}
RexProgram:
for (RelCollation collation : inputCollations) {
final List<RelFieldCollation> fieldCollations = new ArrayList<>(0);
for (RelFieldCollation fieldCollation : collation.getFieldCollations()) {
final int source = fieldCollation.getFieldIndex();
final int target = targets[source];
if (target < 0) {
continue loop;
}
fieldCollations.add(fieldCollation.withFieldIndex(target));
}
{code}
3) Specify capacity with exact size.
{code:java}
HepPlanner.java:
final List<RelNode> children = new ArrayList<>(childRels.size());
for (HepRelVertex childRel : childRels) {
children.add(childRel.getCurrentRel());
}
{code}
f we leave it as what happens now, not care too much, then I hope this issue
could be help people think of performance as well as clean code in the same
time. (Just row I notice people use
_description.matches("[A-Za-z][-A-Za-z0-9_.(),\\[\\]\\s:]*")_ in rule
constructor, which should be make the +Patter.complie+ out :) sorry for
nitpicking).
> Make ArrayList creation with initial capacity when size is fixed
> ----------------------------------------------------------------
>
> Key: CALCITE-3878
> URL: https://issues.apache.org/jira/browse/CALCITE-3878
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.22.0
> Reporter: neoremind
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> I find many places in Calcite where _new ArrayList<>()_ is used, if the list
> is expected to be immutable or not resizing, it is always a good manner to
> create with initial capacity, better for memory usage and performance.
> I search all occurrences, focus on the core module, to make it safe, I only
> update local variables with fixed size and not working in recursive method.
> If the local variable reference goes out of scope, if resizing is needed,
> things will work normally as well, so no side effect, but for the "escaping"
> case, I am very conservative and do not change them.
>
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)