[GitHub] spark pull request #21071: [SPARK-21962][CORE] Distributed Tracing in Spark

2018-04-22 Thread devaraj-kavali
GitHub user devaraj-kavali closed the pull request at:

https://github.com/apache/spark/pull/21071


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21071: [SPARK-21962][CORE] Distributed Tracing in Spark

2018-04-16 Thread steveloughran
GitHub user steveloughran commented on a diff in the pull request:

https://github.com/apache/spark/pull/21071#discussion_r181810726
  
--- Diff: core/src/main/scala/org/apache/spark/trace/SparkAppTracer.scala ---
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.trace
+
+import org.apache.htrace.core.{HTraceConfiguration, Tracer}
+
+import org.apache.spark.SparkConf
+
+object SparkAppTracer {
--- End diff --

Best to mark this object as `private[spark]`.





[GitHub] spark pull request #21071: [SPARK-21962][CORE] Distributed Tracing in Spark

2018-04-13 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/21071

[SPARK-21962][CORE] Distributed Tracing in Spark

## What changes were proposed in this pull request?

This PR integrates Spark with HTrace. It sends traces for the application and
tasks when span receivers are configured. HTrace settings can be supplied along
with the Spark configuration by adding the prefix `spark.htrace.` to the HTrace
configuration keys, as shown below:

`spark.htrace.span.receiver.classes` org.apache.htrace.core.LocalFileSpanReceiver;org.apache.htrace.impl.HTracedSpanReceiver;org.apache.htrace.impl.ZipkinSpanReceiver
`spark.htrace.htraced.receiver.address` IP:PORT
`spark.htrace.local.file.span.receiver.path` /path/local-span-file
`spark.htrace.sampler.classes` org.apache.htrace.core.AlwaysSampler
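
The prefix convention above can be illustrated with a minimal sketch. It assumes the HTrace key is obtained by stripping `spark.htrace.` from the Spark key; the object and method names are hypothetical and are not taken from the patch, and a plain `Map` stands in for the Spark configuration.

```scala
// Hypothetical helper, not code from this PR: picks the HTrace settings
// out of a flat map of Spark-style configuration entries.
object HTraceConfExtractor {
  // Prefix named in the PR description.
  val Prefix = "spark.htrace."

  /** Keeps only the entries whose key starts with Prefix and strips that
    * prefix, so "spark.htrace.sampler.classes" becomes "sampler.classes".
    * Entries without the prefix are dropped.
    */
  def extract(sparkConf: Map[String, String]): Map[String, String] =
    sparkConf.collect {
      case (key, value) if key.startsWith(Prefix) =>
        key.stripPrefix(Prefix) -> value
    }
}
```

For example, calling `HTraceConfExtractor.extract` on a map that holds `spark.app.name` and `spark.htrace.sampler.classes` would return a single entry keyed `sampler.classes`.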

It also provides an additional configuration, `spark.app.spanId`, for passing
in a parent span. If `spark.app.spanId` is set, its value is used as the parent
span; otherwise a new span is started for each application.
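
The parent-span rule can be sketched as a small decision function. This is only an illustration: the types and names below are hypothetical, a plain `Map` stands in for the Spark configuration, and treating a blank value as unset is an assumption rather than behavior stated in this PR.

```scala
// Hypothetical sketch, not code from this PR: decides how the
// application-level span should be started.
object ParentSpanResolver {
  // Configuration name given in the PR description.
  val SpanIdKey = "spark.app.spanId"

  sealed trait SpanStart
  // Continue an existing trace under the given parent span id.
  final case class ChildOf(parentSpanId: String) extends SpanStart
  // No parent supplied: start a fresh span for this application.
  case object NewRootSpan extends SpanStart

  /** Returns ChildOf when spark.app.spanId holds a non-blank value,
    * NewRootSpan otherwise (blank-as-unset is an assumption).
    */
  def resolve(conf: Map[String, String]): SpanStart =
    conf.get(SpanIdKey).map(_.trim).filter(_.nonEmpty) match {
      case Some(id) => ChildOf(id)
      case None     => NewRootSpan
    }
}
```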

## How was this patch tested?

I verified this with the existing tests plus the added test, and also
manually in all of the deployment modes below, with the different tracers
enabled individually and together.

1. Local and local-cluster
2. Standalone Client and Cluster modes
3. YARN Client and Cluster modes
4. Mesos Client and Cluster modes

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-21962

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21071.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21071


commit 254e4ed38411d45cc8c2ba8cdace069da219c359
Author: Devaraj K 
Date:   2018-04-14T00:06:36Z

[SPARK-21962][CORE] Distributed Tracing in Spark



