liyafan82 commented on a change in pull request #1832: [CALCITE-3827] Reduce
the time complexity of finding in-edges of a vertex in the graph
URL: https://github.com/apache/calcite/pull/1832#discussion_r384968735
##########
File path:
core/src/main/java/org/apache/calcite/util/graph/DefaultDirectedGraph.java
##########
@@ -120,45 +132,62 @@ public E getEdge(V source, V target) {
}
public boolean removeEdge(V source, V target) {
- final VertexInfo<V, E> info = vertexMap.get(source);
- List<E> outEdges = info.outEdges;
+ // remove out edges
+ final VertexInfo<V, E> outInfo = vertexOutMap.get(source);
+ List<E> outEdges = outInfo.edges;
+ boolean outRemoved = false;
for (int i = 0, size = outEdges.size(); i < size; i++) {
E edge = outEdges.get(i);
if (edge.target.equals(target)) {
outEdges.remove(i);
edges.remove(edge);
- return true;
+ outRemoved = true;
+ break;
}
}
- return false;
+
+ // remove in edges
+ final VertexInfo<V, E> inInfo = vertexInMap.get(target);
+ List<E> inEdges = inInfo.edges;
+ boolean inRemoved = false;
+ for (int i = 0, size = inEdges.size(); i < size; i++) {
+ E edge = inEdges.get(i);
+ if (edge.source.equals(source)) {
+ inEdges.remove(i);
+ inRemoved = true;
+ break;
+ }
+ }
+ assert outRemoved == inRemoved;
+ return outRemoved;
}
public Set<V> vertexSet() {
- return vertexMap.keySet();
+ return vertexOutMap.keySet();
}
public void removeAllVertices(Collection<V> collection) {
- vertexMap.keySet().removeAll(collection);
- for (VertexInfo<V, E> info : vertexMap.values()) {
+ // remove out edges
+ vertexOutMap.keySet().removeAll(collection);
+ for (VertexInfo<V, E> info : vertexOutMap.values()) {
//noinspection SuspiciousMethodCalls
- info.outEdges.removeIf(next -> collection.contains(next.target));
+ info.edges.removeIf(next -> collection.contains(next.target));
+ }
+
+ // remove in edges
+ vertexInMap.keySet().removeAll(collection);
+ for (VertexInfo<V, E> info : vertexInMap.values()) {
+ //noinspection SuspiciousMethodCalls
+ info.edges.removeIf(next -> collection.contains(next.source));
}
}
public List<E> getOutwardEdges(V source) {
- return vertexMap.get(source).outEdges;
+ return vertexOutMap.get(source).edges;
}
public List<E> getInwardEdges(V target) {
- final ArrayList<E> list = new ArrayList<>();
- for (VertexInfo<V, E> info : vertexMap.values()) {
- for (E edge : info.outEdges) {
- if (edge.target.equals(target)) {
- list.add(edge);
- }
- }
- }
- return list;
+ return vertexInMap.get(target).edges;
Review comment:
Good question.
We have performed some investigations/evaluations concerning the memory
consumption.
The results show that the memory consumption of the in-edge table is
neglectable, compared with the memory consumption of vertices in the graph.
In our evaluation, we use calcite to optimize a user sql query, after the
transformation by the HepPlanner, we measure the memory consumed by the graph
in the planner by (https://github.com/DimitrisAndreou/memory-measurer).
There are 14 vertices and 13 edges in the graph. The 14 vertices consumes
more than 40MB memory (there are some overlaps for the vertices, but it does
not affect our conclusion).
On the other hand, the memory consumption for an empty in-edge table is over
100 bytes, and whenever a vertex is added to the table, an additional 60 bytes
are consumed. So the total memory consumption of the in-edge table is no more
than 1KB, which is trivial for most environments.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services