taegeonum commented on a change in pull request #317:
URL: https://github.com/apache/incubator-nemo/pull/317#discussion_r703959364
##########
File path:
compiler/frontend/beam/src/main/java/org/apache/nemo/compiler/frontend/beam/transform/GBKTransform.java
##########
@@ -299,6 +300,12 @@ public final void emit(final WindowedValue<KV<K, OutputT>> output) {
oc.emit(output);
}
+ /** Emit latencymark. */
+ @Override
Review comment:
Isn't this code unnecessary, given that AbstractDoFnTransform has the same code block?
##########
File path:
runtime/common/src/main/java/org/apache/nemo/runtime/common/metric/LatencyMetric.java
##########
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nemo.runtime.common.metric;
+
+import org.apache.nemo.common.punctuation.Latencymark;
+
+import java.io.Serializable;
+
+/**
+ * Metric class for recording latencymark and the time when the latencymark is recorded.
+ * The traversal time can be calculated by comparing the time when the latencymark was created with the time recorded.
+ */
+public final class LatencyMetric implements Serializable {
+ private Latencymark latencymark;
+ private long timestamp;
Review comment:
Why not final?
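
A minimal sketch of what the `final` variant could look like, assuming both fields are assigned only in the constructor and never reassigned afterwards. `LatencymarkStub`, `LatencyMetricSketch`, and `getTraversalTime` are hypothetical names used for illustration; they are not part of the PR.

```java
import java.io.Serializable;

// Hypothetical stand-in for org.apache.nemo.common.punctuation.Latencymark,
// reduced to the creation timestamp so the sketch compiles on its own.
final class LatencymarkStub implements Serializable {
  private final long timestamp;

  LatencymarkStub(final long timestamp) {
    this.timestamp = timestamp;
  }

  long getTimestamp() {
    return timestamp;
  }
}

// Sketch of a metric whose two fields are final (assumption: no setters are needed).
final class LatencyMetricSketch implements Serializable {
  private final LatencymarkStub latencymark;
  private final long timestamp;

  LatencyMetricSketch(final LatencymarkStub latencymark, final long timestamp) {
    this.latencymark = latencymark;
    this.timestamp = timestamp;
  }

  LatencymarkStub getLatencymark() {
    return latencymark;
  }

  long getTimestamp() {
    return timestamp;
  }

  // Hypothetical helper: recorded time minus creation time, in whatever unit
  // both timestamps share.
  long getTraversalTime() {
    return timestamp - latencymark.getTimestamp();
  }
}
```

With final fields the object is immutable after construction, so it can be handed to another thread or serialized without defensive copying.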
##########
File path:
runtime/common/src/main/java/org/apache/nemo/runtime/common/metric/LatencyMetric.java
##########
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nemo.runtime.common.metric;
+
+import org.apache.nemo.common.punctuation.Latencymark;
+
+import java.io.Serializable;
+
+/**
+ * Metric class for recording latencymark and the time when the latencymark is recorded.
+ * The traversal time can be calculated by comparing the time when the latencymark was created with the time recorded.
+ */
+public final class LatencyMetric implements Serializable {
+ private Latencymark latencymark;
Review comment:
Why not final?
##########
File path:
common/src/main/java/org/apache/nemo/common/punctuation/Latencymark.java
##########
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nemo.common.punctuation;
+
+import java.io.Serializable;
+import java.util.Objects;
+
+/**
+ * Latency mark is conveyor that has data for debugging.
+ * It is created only from source vertex and record the timestamp when it is created and taskId where it is created.
Review comment:
This contains two task IDs: createdTaskId and lastTaskId. I would like to
know why both fields are required.
##########
File path:
runtime/common/src/main/java/org/apache/nemo/runtime/common/metric/TaskMetric.java
##########
@@ -106,6 +116,33 @@ private void setTaskDuration(final long taskDuration) {
this.taskDuration = taskDuration;
}
+ /**
+ * Method related to stream metric.
+ */
+ public final Map<String, List<StreamMetric>> getStreamMetric() {
+ return this.streamMetrics;
+ }
+
+ private void setStreamMetric(final Map<String, StreamMetric> streamMetricMap) {
+ for (String sourceVertexId : streamMetricMap.keySet()) {
+ StreamMetric streamMetric = streamMetricMap.get(sourceVertexId);
+ this.streamMetrics.putIfAbsent(sourceVertexId, new LinkedList<>());
+ this.streamMetrics.get(sourceVertexId).add(streamMetric);
+ }
+ }
+
+ /**
+ * Method related to latency.
+ */
+ public final Map<String, List<LatencyMetric>> getLatencymarks() {
+ return this.latencymarks;
Review comment:
remove this
##########
File path:
common/src/main/java/org/apache/nemo/common/punctuation/Latencymark.java
##########
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nemo.common.punctuation;
+
+import java.io.Serializable;
+import java.util.Objects;
+
+/**
+ * Latency mark is conveyor that has data for debugging.
+ * It is created only from source vertex and record the timestamp when it is created and taskId where it is created.
+ */
+public final class Latencymark implements Serializable {
+ private final String createdtaskId;
Review comment:
Where is it used?
##########
File path:
common/src/main/java/org/apache/nemo/common/punctuation/Latencymark.java
##########
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nemo.common.punctuation;
+
+import java.io.Serializable;
+import java.util.Objects;
+
+/**
+ * Latency mark is conveyor that has data for debugging.
+ * It is created only from source vertex and record the timestamp when it is created and taskId where it is created.
+ */
+public final class Latencymark implements Serializable {
+ private final String createdtaskId;
+ private String lastTaskId;
+ private final long timestamp;
+
+ /**
+ * @param taskId task id where it is created
+ * @param timestamp timestamp when it is created
+ */
+ public Latencymark(final String taskId, final long timestamp) {
+ this.createdtaskId = taskId;
+ this.timestamp = timestamp;
+ this.lastTaskId = "";
+ }
+
+ /**
+ * @return the latencymark timestamp
+ */
+ public long getTimestamp() {
+ return timestamp;
+ }
+
+ /**
+ * @return the task id where it is created
+ */
+ public String getCreatedtaskId() {
+ return createdtaskId;
+ }
+
+
+ /**
+ * @return the task id where it is delivered from. task id of upstream task
Review comment:
Is it currentTask?
##########
File path:
common/src/main/java/org/apache/nemo/common/ir/vertex/transform/LatencymarkEmitTransform.java
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nemo.common.ir.vertex.transform;
+
+import org.apache.nemo.common.ir.OutputCollector;
+import org.apache.nemo.common.punctuation.Latencymark;
+
+/**
+ * This transform does not emit watermarks.
Review comment:
Is this comment correct?
##########
File path:
runtime/common/src/main/java/org/apache/nemo/runtime/common/metric/TaskMetric.java
##########
@@ -106,6 +116,33 @@ private void setTaskDuration(final long taskDuration) {
this.taskDuration = taskDuration;
}
+ /**
+ * Method related to stream metric.
+ */
+ public final Map<String, List<StreamMetric>> getStreamMetric() {
+ return this.streamMetrics;
Review comment:
return streamMetrics
##########
File path:
runtime/common/src/main/java/org/apache/nemo/runtime/common/metric/LatencyMetric.java
##########
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nemo.runtime.common.metric;
+
+import org.apache.nemo.common.punctuation.Latencymark;
+
+import java.io.Serializable;
+
+/**
+ * Metric class for recording latencymark and the time when the latencymark is recorded.
+ * The traversal time can be calculated by comparing the time when the latencymark was created with the time recorded.
+ */
+public final class LatencyMetric implements Serializable {
+ private Latencymark latencymark;
+ private long timestamp;
+
+ /**
+ * Constructor with the latencymark and timestamp.
+ *
+ * @param latencymark the latencymark what task received.
Review comment:
?
##########
File path:
runtime/common/src/main/java/org/apache/nemo/runtime/common/metric/TaskMetric.java
##########
@@ -106,6 +116,33 @@ private void setTaskDuration(final long taskDuration) {
this.taskDuration = taskDuration;
}
+ /**
+ * Method related to stream metric.
+ */
+ public final Map<String, List<StreamMetric>> getStreamMetric() {
+ return this.streamMetrics;
+ }
+
+ private void setStreamMetric(final Map<String, StreamMetric> streamMetricMap) {
+ for (String sourceVertexId : streamMetricMap.keySet()) {
+ StreamMetric streamMetric = streamMetricMap.get(sourceVertexId);
+ this.streamMetrics.putIfAbsent(sourceVertexId, new LinkedList<>());
Review comment:
remove this
##########
File path:
common/src/main/java/org/apache/nemo/common/punctuation/Latencymark.java
##########
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nemo.common.punctuation;
+
+import java.io.Serializable;
+import java.util.Objects;
+
+/**
+ * Latency mark is conveyor that has data for debugging.
+ * It is created only from source vertex and record the timestamp when it is created and taskId where it is created.
+ */
+public final class Latencymark implements Serializable {
+ private final String createdtaskId;
+ private String lastTaskId;
+ private final long timestamp;
+
+ /**
+ * @param taskId task id where it is created
+ * @param timestamp timestamp when it is created
+ */
+ public Latencymark(final String taskId, final long timestamp) {
+ this.createdtaskId = taskId;
+ this.timestamp = timestamp;
+ this.lastTaskId = "";
+ }
+
+ /**
+ * @return the latencymark timestamp
+ */
+ public long getTimestamp() {
+ return timestamp;
+ }
+
+ /**
+ * @return the task id where it is created
+ */
+ public String getCreatedtaskId() {
Review comment:
Where is it used?
##########
File path:
common/src/main/java/org/apache/nemo/common/punctuation/Latencymark.java
##########
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nemo.common.punctuation;
+
+import java.io.Serializable;
+import java.util.Objects;
+
+/**
+ * Latency mark is conveyor that has data for debugging.
+ * It is created only from source vertex and record the timestamp when it is created and taskId where it is created.
Review comment:
But the comment says that `taskId` is the task where the event is
created, not lastTaskId. Is this comment incorrect?
##########
File path:
runtime/common/src/main/java/org/apache/nemo/runtime/common/metric/TaskMetric.java
##########
@@ -106,6 +116,33 @@ private void setTaskDuration(final long taskDuration) {
this.taskDuration = taskDuration;
}
+ /**
+ * Method related to stream metric.
+ */
+ public final Map<String, List<StreamMetric>> getStreamMetric() {
+ return this.streamMetrics;
+ }
+
+ private void setStreamMetric(final Map<String, StreamMetric> streamMetricMap) {
+ for (String sourceVertexId : streamMetricMap.keySet()) {
+ StreamMetric streamMetric = streamMetricMap.get(sourceVertexId);
+ this.streamMetrics.putIfAbsent(sourceVertexId, new LinkedList<>());
+ this.streamMetrics.get(sourceVertexId).add(streamMetric);
Review comment:
remove this
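
As a side note on the loop in this hunk, the `putIfAbsent` followed by `get` could probably be folded into a single `Map.computeIfAbsent` call, and iterating over `entrySet()` would avoid the second lookup per key. A minimal sketch, assuming the map is only touched from one thread; `StreamMetricAccumulator` and its type parameter `M` are hypothetical and stand in for the `TaskMetric` field and `StreamMetric`:

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Hypothetical holder that mirrors the per-source-vertex accumulation in the hunk above.
final class StreamMetricAccumulator<M> {
  private final Map<String, List<M>> streamMetrics = new HashMap<>();

  // Appends each incoming metric to the list kept for its source vertex id,
  // creating the list on first use.
  void add(final Map<String, M> streamMetricMap) {
    for (final Map.Entry<String, M> entry : streamMetricMap.entrySet()) {
      streamMetrics
        .computeIfAbsent(entry.getKey(), id -> new LinkedList<>())
        .add(entry.getValue());
    }
  }

  Map<String, List<M>> get() {
    return streamMetrics;
  }
}
```

The behavior should match the original loop; the difference is one map lookup per entry instead of three, and no list allocation when the key already exists.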
##########
File path:
common/src/main/java/org/apache/nemo/common/punctuation/Latencymark.java
##########
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.nemo.common.punctuation;
+
+import java.io.Serializable;
+import java.util.Objects;
+
+/**
+ * Latency mark is conveyor that has data for debugging.
+ * It is created only from source vertex and record the timestamp when it is created and taskId where it is created.
+ */
+public final class Latencymark implements Serializable {
+ private final String createdtaskId;
+ private String lastTaskId;
+ private final long timestamp;
+
+ /**
+ * @param taskId task id where it is created
+ * @param timestamp timestamp when it is created
+ */
+ public Latencymark(final String taskId, final long timestamp) {
+ this.createdtaskId = taskId;
+ this.timestamp = timestamp;
+ this.lastTaskId = "";
+ }
+
+ /**
+ * @return the latencymark timestamp
+ */
+ public long getTimestamp() {
+ return timestamp;
+ }
+
+ /**
+ * @return the task id where it is created
+ */
+ public String getCreatedtaskId() {
+ return createdtaskId;
+ }
+
+
+ /**
+ * @return the task id where it is delivered from. task id of upstream task
Review comment:
The field name `lastTaskId` is confusing to me. Isn't it the sink? What does
"last" refer to here? Is it the previous task, or the last task?
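
To make the question concrete: if the field really holds the id of the upstream task that delivered the mark, as the Javadoc on the getter suggests, a name such as `previousTaskId` might be less ambiguous. A minimal sketch under that assumption; `LatencymarkSketch`, `previousTaskId`, and `setPreviousTaskId` are hypothetical names, not the PR's API:

```java
import java.io.Serializable;

// Hypothetical variant of Latencymark with the two task ids named by their role.
final class LatencymarkSketch implements Serializable {
  private final String createdTaskId; // source task that created the mark; never changes
  private final long timestamp;       // creation time of the mark
  private String previousTaskId;      // upstream task that most recently forwarded the mark

  LatencymarkSketch(final String createdTaskId, final long timestamp) {
    this.createdTaskId = createdTaskId;
    this.timestamp = timestamp;
    this.previousTaskId = ""; // empty until the mark has been forwarded once
  }

  String getCreatedTaskId() {
    return createdTaskId;
  }

  long getTimestamp() {
    return timestamp;
  }

  String getPreviousTaskId() {
    return previousTaskId;
  }

  // Assumed to be called by a task just before it forwards the mark downstream.
  void setPreviousTaskId(final String taskId) {
    this.previousTaskId = taskId;
  }
}
```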
##########
File path:
runtime/executor/src/main/java/org/apache/nemo/runtime/executor/task/TaskExecutor.java
##########
@@ -123,9 +140,63 @@ public TaskExecutor(final Task task,
this.dataFetchers = pair.left();
this.sortedHarnesses = pair.right();
+ // initialize metrics
+ this.numOfReadTupleMap = new HashMap<>();
+ this.lastSerializedReadByteMap = new HashMap<>();
+ for (DataFetcher dataFetcher : dataFetchers) {
+ this.numOfReadTupleMap.put(dataFetcher.getDataSource().getId(), new AtomicLong());
+ this.lastSerializedReadByteMap.put(dataFetcher.getDataSource().getId(), 0L);
+ }
+
+ // set the interval for recording stream metric
+ if (streamMetricRecordPeriod > 0) {
+ this.timeSinceLastRecordStreamMetric = System.currentTimeMillis();
+ this.periodicMetricService = Executors.newScheduledThreadPool(1);
+ this.periodicMetricService.scheduleAtFixedRate(
+ this::saveStreamMetric, 0, streamMetricRecordPeriod, TimeUnit.MILLISECONDS);
+ }
this.timeSinceLastExecution = System.currentTimeMillis();
}
+ // Send stream metric to the runtime master
+ private void saveStreamMetric() {
Review comment:
Sending metrics from each task may lead to huge overhead if the number of
tasks is large. Can't we use a separate thread for sending metrics? Maybe we
need another class for retrieving and sending task metrics.
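
One possible shape for such a class, as a rough sketch only: a single scheduler thread shared by all tasks in an executor, with each task registering a callback instead of creating its own thread pool. `SharedMetricReporter`, `register`, `unregister`, and `reportAll` are hypothetical names, and how the callback actually reaches the runtime master is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical executor-wide reporter: one scheduler thread for every task.
final class SharedMetricReporter implements AutoCloseable {
  private final Map<String, Runnable> reporters = new ConcurrentHashMap<>();
  private final ScheduledExecutorService scheduler =
    Executors.newSingleThreadScheduledExecutor();

  // periodMillis must be positive, as required by scheduleAtFixedRate.
  SharedMetricReporter(final long periodMillis) {
    scheduler.scheduleAtFixedRate(
      this::reportAll, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
  }

  // Called by a task when it starts; the callback sends that task's metrics.
  void register(final String taskId, final Runnable reporter) {
    reporters.put(taskId, reporter);
  }

  // Called by a task when it finishes.
  void unregister(final String taskId) {
    reporters.remove(taskId);
  }

  // Runs every registered callback once. A runtime exception from one task is
  // swallowed here, because an exception escaping a scheduleAtFixedRate task
  // suppresses all of its later runs.
  void reportAll() {
    for (final Runnable reporter : reporters.values()) {
      try {
        reporter.run();
      } catch (final RuntimeException e) {
        // A real implementation would log this; the sketch just continues.
      }
    }
  }

  @Override
  public void close() {
    scheduler.shutdownNow();
  }
}
```

A task would call `register(taskId, this::saveStreamMetric)` in its constructor and `unregister(taskId)` on completion, so the number of metric threads stays constant regardless of how many tasks run.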