[
https://issues.apache.org/jira/browse/CALCITE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17132852#comment-17132852
]
Liya Fan edited comment on CALCITE-4049 at 6/11/20, 2:54 AM:
-------------------------------------------------------------
[~julianhyde] Thanks a lot for your comments.
The down-side of adding the size field is that it makes the object size 4 bytes
larger (I am not sure if it is 50% larger, as an object has other meta-data to
store). The up-side is reducing the time complexity of getting size from O( n )
to O(1).
IMO, a general trend for performance optimization is to reduce time complexity
by taking more spaces, as memory/disk spaces are becoming less expensive today
and modern computers tend to have larger memory/disk spaces.
For example, we build indices for relation databases to reduce the query time,
at the expense of extra memory/disk spaces. We can also find examples from
Calcite code base. For example, in class
[RexCall|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexCall.java],
we have the {{nodeCount}} field, which introduces an extra 4 byte. However,
since it reduces the time complexity of getting node count, it should be
considered a good optimization.
was (Author: fan_li_ya):
[~julianhyde] Thanks a lot for your comments.
The down-side of adding the size field is that it makes the object size 4 bytes
larger (I am not sure if it is 50% larger, as an object has other meta-data to
store). The up-side is reducing the time complexity of getting size from O(n)
to O(1).
IMO, a general trend for performance optimization is to reduce time complexity
by taking more spaces, as memory/disk spaces are becoming less expensive today
and modern computers tend to have larger memory/disk spaces.
For example, we build indices for relation databases to reduce the query time,
at the expense of extra memory/disk spaces. We can also find examples from
Calcite code base. For example, in class
[RexCall|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rex/RexCall.java],
we have the {{nodeCount}} field, which introduces an extra 4 byte. However,
since it reduces the time complexity of getting node count, it should be
considered a good optimization.
> Reduce the time complexity of getting shortest distances
> --------------------------------------------------------
>
> Key: CALCITE-4049
> URL: https://issues.apache.org/jira/browse/CALCITE-4049
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Reporter: Liya Fan
> Assignee: Liya Fan
> Priority: Major
> Fix For: 1.24.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Currently, we have {{Graphs#makeImmutable}} to compute the shortest paths
> between all pairs of nodes. For many scenarios, however, we do not need the
> exact paths between nodes. Instead, we are only interested in the lengths of
> the shortest paths.
> To get the path length, we need to get the shortest path first, which is
> returned as a {{List}}, then we call the {{List#size()}} method. According to
> the current implementation, the returned list is of type {{ConsList}}. The
> time complexity of {{ConsList#size}} is O(p) (p is the number of vertices on
> the path), which is inefficient.
> In this issue, we revise the implementation of {{ConsList#size}} so that it
> takes O(1) time. In addition, we also give a utiltiy to get the shortest
> distances between nodes.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)