[jira] [Commented] (SOLR-9641) Emit distributed tracing information from Solr

2016-10-24 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603146#comment-15603146
 ] 

Mike Drob commented on SOLR-9641:
-

Which one is the "default"? I see {{./solr/example/exampledocs/solr.xml}} and 
{{./solr/server/solr/solr.xml}}


> Emit distributed tracing information from Solr
> --
>
> Key: SOLR-9641
> URL: https://issues.apache.org/jira/browse/SOLR-9641
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mike Drob
> Fix For: master (7.0)
>
> Attachments: SOLR-9641.patch
>
>
> While Solr already offers a few tools for exposing timing, this information 
> can be difficult to aggregate and analyze. By integrating distributed tracing 
> into Solr operations, we can gain new performance and behaviour insights.
> One such solution can be accomplished via Apache HTrace (incubating).
> (More rationale to follow.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9641) Emit distributed tracing information from Solr

2016-10-24 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603141#comment-15603141
 ] 

Mike Drob commented on SOLR-9641:
-

bq. in CoreContainer there is one zkSys.getZkController().getNodeName() and one 
getZkController().getNodeName() call, they could be combined into one call with 
result kept in local variable or both could use or not use zkSys for clarity.
Done.

bq. In SearchHandler, how about also having trace scopes for the 
handleResponses and finishStage steps? Or if the intention is to only trace 
component methods which typically make requests to other shards maybe not trace 
the prepare step?
Hmm... yes, this could make sense. I didn't want to put too much in for the 
distributed request portion because that also gets traced on the remote peers. 
But you're right that something should be looked at here. Adding it around only 
handleResponses and finishStage seems insufficient? There are a lot of other 
things going on in the distributed branch there. Will come back to this later...

bq. In CoreAdminHandler for the callInfo.call(); there is the traceDescription 
+ " async" scope i.e. differentiation between sync and async. Just wondering if 
something similar might be useful for SearchHandler's without-debug and 
with-debug prepare and process scopes?
You mean labelling the debug scope with a debug description? Yeah, that's 
doable. My async description was largely a hack, I think, and will probably go 
away in favor of something more generic.

bq. In the tests, curious why only [0] is being added in the getReceivers 
methods?
Because there was only one receiver configured per jetty. I'll change this to 
grab them all.

bq. In the tests, might the Random random() method be passed down to SpanId
Good idea. I'll make a utility method in Solr for now, but I also filed 
HTRACE-391.




[jira] [Commented] (SOLR-9641) Emit distributed tracing information from Solr

2016-10-22 Thread Christine Poerschke (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15598675#comment-15598675
 ] 

Christine Poerschke commented on SOLR-9641:
---

bq. This is really cool Mike Drob! ...

+1 to that, I am also looking forward to having tracing support in Solr.

Here are my comments from looking at the patch:

* minor: in CoreContainer there is one 
{{zkSys.getZkController().getNodeName()}} and one 
{{getZkController().getNodeName()}} call, they could be combined into one call 
with result kept in local variable or both could use or not use {{zkSys}} for 
clarity.

* In SearchHandler, how about also having trace scopes for the 
{{handleResponses}} and {{finishStage}} steps? Or if the intention is to only 
trace component methods which typically make requests to other shards maybe not 
trace the {{prepare}} step?

* In CoreAdminHandler for the {{callInfo.call();}} there is the 
{{traceDescription + " async"}} scope i.e. differentiation between sync and 
async. Just wondering if something similar might be useful for SearchHandler's 
without-debug and with-debug prepare and process scopes?

* In the tests, curious why only \[0\] is being added in the getReceivers 
methods?

* In the tests, might the {{Random random()}} method be passed down to SpanId 
i.e. for the tests
{code}
- ... SpanId.fromRandom() ...
+ ... SpanId.fromRandom(random()) ...
{code}
and for 
[SpanId.java|https://github.com/apache/incubator-htrace/blob/master/htrace-core4/src/main/java/org/apache/htrace/core/SpanId.java]
 something along the lines of
{code}
+ import java.util.Random;
+
+ private static long nonZeroRand64(Random random) {
+   while (true) {
+     long r = random.nextLong();
+     if (r != 0) {
+       return r;
+     }
+   }
+ }
+
+ public static SpanId fromRandom(Random random) {
+   return new SpanId(nonZeroRand64(random), nonZeroRand64(random));
+ }
{code}
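
To illustrate why passing the test's Random down might be useful, here is a 
minimal, self-contained sketch. It assumes the goal is reproducible ids under a 
fixed seed; DemoSpanId is a hypothetical stand-in and not the HTrace SpanId 
class. Two generators created with the same seed yield the same pair of 
non-zero values.

```java
import java.util.Random;

// Hypothetical stand-in for HTrace's SpanId; only the two 64-bit halves matter here.
final class DemoSpanId {
  final long high;
  final long low;

  DemoSpanId(long high, long low) {
    this.high = high;
    this.low = low;
  }

  // Draw from the supplied generator until a non-zero value comes back.
  static long nonZeroRand64(Random random) {
    while (true) {
      long r = random.nextLong();
      if (r != 0) {
        return r;
      }
    }
  }

  // Both halves come from the caller's Random, so a fixed seed fixes the id.
  static DemoSpanId fromRandom(Random random) {
    return new DemoSpanId(nonZeroRand64(random), nonZeroRand64(random));
  }

  public static void main(String[] args) {
    DemoSpanId a = fromRandom(new Random(42L));
    DemoSpanId b = fromRandom(new Random(42L));
    // Same seed, same sequence of draws, so the two ids match.
    System.out.println(a.high == b.high && a.low == b.low);
  }
}
```

With a seeded generator supplied by the test, a failing run could in principle 
be replayed with the same span ids.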




[jira] [Commented] (SOLR-9641) Emit distributed tracing information from Solr

2016-10-20 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592405#comment-15592405
 ] 

David Smiley commented on SOLR-9641:


Yes.  Perhaps the default solr.xml might have a commented trace section -- 
brief.




[jira] [Commented] (SOLR-9641) Emit distributed tracing information from Solr

2016-10-20 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592319#comment-15592319
 ] 

Mike Drob commented on SOLR-9641:
-

Documenting what goes in the {{trace}} section of solr.xml would also belong in 
the ref guide, yes?




[jira] [Commented] (SOLR-9641) Emit distributed tracing information from Solr

2016-10-20 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592025#comment-15592025
 ] 

David Smiley commented on SOLR-9641:


Docs:
* javadocs: probably on the Tracer field you added to core container. 
TracerUtils.java should refer to that so people know where it's placed in Solr.
* user docs: we'll probably want to add this to the ref guide... at least 
something very brief that can demonstrate the simplest useful way to see it in 
action, and then we refer users to other possibilities (e.g. Zipkin).  There 
ought to be a reference to this feature in the vicinity of where 
debugQuery/debug=timing is so people know of this more sophisticated option.




[jira] [Commented] (SOLR-9641) Emit distributed tracing information from Solr

2016-10-20 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592003#comment-15592003
 ] 

David Smiley commented on SOLR-9641:


See HttpSolrCall.call around line 469 (writeResponse)




[jira] [Commented] (SOLR-9641) Emit distributed tracing information from Solr

2016-10-20 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591981#comment-15591981
 ] 

Mike Drob commented on SOLR-9641:
-

[~tomasflobbe] - we were talking last week about adding a trace around the 
response writer, but I'm struggling to find where that logic is. Can you give 
me a pointer?




[jira] [Commented] (SOLR-9641) Emit distributed tracing information from Solr

2016-10-20 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591976#comment-15591976
 ] 

Mike Drob commented on SOLR-9641:
-

Thanks for taking a look, [~dsmiley]!

bq. Can you recommend a tool that can be used with Solr after this patch is 
applied to visualize or otherwise make use of it to help us analyze Solr 
performance?
The built-in HTrace viewer is reasonable for some purposes, but probably not 
ideal for all of them. There is also a Zipkin bridge, so you could use that as 
your visualizer. Both are configured by setting the {{span.receiver.classes}} 
configuration to the appropriate value.

My docs are pretty sparse at the moment, where would you suggest placing them? 
We can have a short description and then refer to the full HTrace docs for 
completeness.

{quote}
* SolrCore.newScope: guard log.debug with log.isDebugEnabled to avoid toString
* HttpShardHandler: maybe instead of always wrapping task with traceTask we 
instead conditionally replace task with a tracing one? This way we conveniently 
avoid the wrapping if there is no tracing.
* CommonParams.java:TRACE_ID: a one-liner comment referencing "HTrace" would be 
useful.
{quote}
Done. I'm not going to upload a new patch yet, since the changes are relatively 
minimal and I don't want to clutter the issue.
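
For readers without the patch at hand, the HttpShardHandler suggestion (swap in 
a tracing task only when tracing is active) could look roughly like the sketch 
below. Every name in it (ConditionalTraceWrap, TraceContext, TracedCallable, 
maybeTrace) is hypothetical and is not taken from the patch or from HTrace; the 
wrapper only marks where a scope would be opened and closed.

```java
import java.util.concurrent.Callable;

final class ConditionalTraceWrap {

  // Hypothetical stand-in for whatever tells the handler that tracing is active.
  interface TraceContext {
    boolean isTracing();
  }

  // Hypothetical wrapper; a real one would open and close a trace scope around call().
  static final class TracedCallable<T> implements Callable<T> {
    private final Callable<T> delegate;

    TracedCallable(Callable<T> delegate) {
      this.delegate = delegate;
    }

    @Override
    public T call() throws Exception {
      // A trace scope would be opened here.
      try {
        return delegate.call();
      } finally {
        // The trace scope would be closed here.
      }
    }
  }

  // Return the original task untouched when there is nothing to trace,
  // so the no-tracing path pays for no extra object or indirection.
  static <T> Callable<T> maybeTrace(TraceContext ctx, Callable<T> task) {
    return ctx.isTracing() ? new TracedCallable<T>(task) : task;
  }
}
```

When tracing is off, the caller gets back the very same task instance, which is 
the "conveniently avoid the wrapping" behaviour described in the review comment.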

{quote}
* loadTraceConfig: could you use NamedList.asMap(1) or perhaps not because 
"String" type?
{quote}
I tried this and it worked, but something about it feels incredibly fragile. 
I'll leave it in for now, however.

{quote}
* TracerUtils: I like this. Question: should newScope(SolrQueryRequest request, 
String description) also look in the request params to see if there is a 
parent, and if so conditionally call tracer.newScope with that parent?
{quote}
Hmm, maybe. I know that it is possible to have multiple parents per span, but I 
think the APIs around it are a little clunky. Will need to think on this more.

Actually, no. I don't think we need to pull the parent from the request params 
here, since we already do that in {{SolrCore.newScope}}, which should be 
handling most things. The method in {{TracerUtils}} is more of a convenience 
thing to get at the core container so we can get the tracer.




[jira] [Commented] (SOLR-9641) Emit distributed tracing information from Solr

2016-10-19 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15588732#comment-15588732
 ] 

David Smiley commented on SOLR-9641:


This is really cool [~mdrob]!  I learned about tracing at Apache Big Data this 
year and I became hopeful that one day Solr would get tracing abilities.  Can 
you recommend a tool that can be used with Solr after this patch is applied to 
visualize or otherwise make use of it to help us analyze Solr performance?

I looked at the patch; the approach is overall quite nice I think.  Some 
comments:
* SolrCore.newScope: guard log.debug with log.isDebugEnabled to avoid toString 
* loadTraceConfig: could you use NamedList.asMap(1)  or perhaps not because 
"String" type?
* TracerUtils: I like this. Question: should newScope(SolrQueryRequest request, 
String description) also look in the request params to see if there is a 
parent, and if so conditionally call tracer.newScope with that parent?
* HttpShardHandler: maybe instead of always wrapping task with traceTask we 
instead conditionally replace task with a tracing one?  This way we 
conveniently avoid the wrapping if there is no tracing.
* CommonParams.java:TRACE_ID: a one-liner comment referencing "HTrace" would be 
useful.
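
The first point above (guarding the debug log call) can be sketched as follows. 
This is an illustration only: it uses java.util.logging so that it runs without 
extra dependencies, with isLoggable(Level.FINE) standing in for 
log.isDebugEnabled, and ExpensiveSpan is a hypothetical object that counts its 
toString calls.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class DebugGuardDemo {
  private static final Logger LOG = Logger.getLogger(DebugGuardDemo.class.getName());

  static {
    // Pin the level so FINE (the debug equivalent) is disabled for this demo.
    LOG.setLevel(Level.INFO);
  }

  // Hypothetical span-like object; the counter records each toString() call.
  static final class ExpensiveSpan {
    int toStringCalls = 0;

    @Override
    public String toString() {
      toStringCalls++;
      return "span{...}";
    }
  }

  // The string concatenation runs toString() even though FINE is disabled.
  static void logUnguarded(ExpensiveSpan span) {
    LOG.fine("new scope: " + span);
  }

  // The level check comes first, so toString() is skipped when FINE is disabled.
  static void logGuarded(ExpensiveSpan span) {
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine("new scope: " + span);
    }
  }
}
```

With SLF4J, parameterized messages such as log.debug("new scope: {}", span) 
defer the toString call in a similar way.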
