Re: Distributed tracing for Solr via adding HTTP headers?
I have had this exact same use case, and we ended up just setting a header value and then, in a servlet Filter, reading that header and setting the MDC property within the filter. Because the filter only reads the header, Solr didn't complain about the request being consumed before it made it to SolrDispatchFilter. We used the Jetty web defaults to insert this functionality at the beginning of the servlet processing chain without having to crack open the war.

-Steve
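Steve's approach -- read a header (never a parameter) in an early filter and copy it into the MDC -- might be sketched as below. This is a minimal, hypothetical illustration, not Steve's actual filter: the "Solr-" prefix convention is borrowed from Gregg's proposal later in the thread, and the header-matching logic is factored into a plain helper so the sketch stays free of the servlet API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the header-to-MDC step of an early servlet filter, factored
// into a pure helper so it is self-contained. Assumption (illustrative,
// not from Steve's code): any header starting with "Solr-" is copied into
// the MDC under its own name.
public class TraceHeaderExtractor {
    static final String PREFIX = "Solr-";

    // In a real filter, `headers` would come from enumerating
    // HttpServletRequest#getHeaderNames()/getHeader(name). Reading
    // headers, unlike getParameter(), does not consume the POST body,
    // which is why this avoids the problem discussed in this thread.
    public static Map<String, String> extractMdcEntries(Map<String, String> headers) {
        Map<String, String> mdc = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                mdc.put(e.getKey(), e.getValue());
            }
        }
        return mdc;
    }
}
```

In the filter's doFilter, each returned entry would presumably go to org.slf4j.MDC.put(...) before chain.doFilter(...), with a matching MDC.remove(...) in a finally block so a pooled worker thread does not leak trace values into the next request it serves.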
Re: Distributed tracing for Solr via adding HTTP headers?
Yes, I see. SolrDispatchFilter is not really written with extensibility in mind.

-Mike
Re: Distributed tracing for Solr via adding HTTP headers?
Michael,

Thanks! Unfortunately, as we use POSTs, that approach would trigger the getParameterIncompatibilityException call due to the Enumeration of getParameterNames before SolrDispatchFilter has a chance to access the InputStream.

I opened https://issues.apache.org/jira/browse/SOLR-5969 to discuss further and attached our current patch.
Re: Distributed tracing for Solr via adding HTTP headers?
I had to grapple with something like this problem when I wrote Lux's app-server. I extended SolrDispatchFilter and handle parameter swizzling to keep everything nicey-nicey for Solr while being able to play games with parameters of my own. Perhaps this will give you some ideas:

https://github.com/msokolov/lux/blob/master/src/main/java/lux/solr/LuxDispatchFilter.java

It's definitely hackish, but it seems to get the job done for me. It's not a reusable component, but it might serve as an illustration of one way to handle the problem.

-Mike
Re: Distributed tracing for Solr via adding HTTP headers?
So to rephrase: Solr will barf at unknown parameters, so we cannot currently send them in band, and the out-of-band approach does not work due to POST body handling complexity. You are proposing, effectively, a dynamic parameter set with a common prefix to stop the complaints, plus the code to propagate those params. Is that a good general description? I am just wondering if this can be matched to some other real issues as well.

Regards,
Alex
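Alexandre's summary -- a dynamic parameter set under a common prefix, separated out before Solr sees it -- could be sketched as a pure split function. The "trace." prefix below is invented for illustration and is not from the patch; the point is that values under the reserved prefix go to the MDC while the remainder is what Solr would parse and log, so trace values are not logged twice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the "common prefix" idea. "trace." is an
// invented prefix, not a real Solr convention.
public class TraceParamSplitter {
    static final String PREFIX = "trace.";

    // Splits a flat parameter map into two maps: "trace" (prefix stripped,
    // destined for the MDC) and "params" (everything Solr should see).
    public static Map<String, Map<String, String>> split(Map<String, String> params) {
        Map<String, String> trace = new LinkedHashMap<>();
        Map<String, String> rest = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                trace.put(e.getKey().substring(PREFIX.length()), e.getValue());
            } else {
                rest.put(e.getKey(), e.getValue());
            }
        }
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        result.put("trace", trace);
        result.put("params", rest);
        return result;
    }
}
```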
Re: Distributed tracing for Solr via adding HTTP headers?
That was my first attempt, but it's much trickier than I anticipated.

A filter that calls HttpServletRequest#getParameter() before SolrDispatchFilter will trigger an exception -- see getParameterIncompatibilityException [1] -- if the request is a POST. It seems that Solr depends on the configured per-core SolrRequestParser to properly parse the request parameters. A servlet filter that ran before SolrDispatchFilter would need to fetch the correct SolrRequestParser for the requested core, parse the request, and reset the InputStream before pulling the data into the MDC. It also duplicates the work of request parsing. It's especially tricky if you want to remove the tracing parameters from the SolrParams and keep them only in the MDC, to avoid them being logged twice.

[1] https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/servlet/SolrRequestParsers.java#L621:L628
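The "reset the InputStream" difficulty above comes down to the servlet request body being a one-shot stream: once a filter has read it, Solr's own parser sees nothing. A stdlib-only sketch of the buffer-and-replay workaround such a filter would need (names are illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Demonstrates why a filter cannot casually parse a POST body before
// SolrDispatchFilter: the stream is consumed on first read, so the filter
// must buffer the bytes and hand Solr a fresh stream over the copy.
public class BodyReplay {
    // Drain the one-shot stream completely into memory.
    public static byte[] buffer(InputStream body) throws IOException {
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = body.read(buf)) != -1) {
            copy.write(buf, 0, n);
        }
        return copy.toByteArray();
    }

    // In a real filter this would back an HttpServletRequestWrapper whose
    // getInputStream() returns a fresh ServletInputStream over the copy.
    public static InputStream replay(byte[] buffered) {
        return new ByteArrayInputStream(buffered);
    }
}
```

After buffer() runs, the original stream is spent (read() returns -1), while replay() can be called as many times as needed -- which is the extra work, and the duplication of parsing, that Gregg describes.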
Re: Distributed tracing for Solr via adding HTTP headers?
On the second thought: if you are already managing to pass the value using the request parameters, what stops you from just having a servlet filter looking for that parameter and assigning it directly to the MDC context?

Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
Re: Distributed tracing for Solr via adding HTTP headers?
I like the idea. No comments about the implementation; I'll leave that to others.

But if it is done, maybe somebody very familiar with logging can also review Solr's current logging config. I suspect it is not optimized for troubleshooting at this point.

Regards,
Alex.
Distributed tracing for Solr via adding HTTP headers?
We have some metadata -- e.g. a request UUID -- that we log to every log line using Log4J's MDC [1]. The UUID logging allows us to connect any log lines we have for a given request across servers. Sort of like Zipkin [2].

Currently we're using EmbeddedSolrServer without sharding, so adding the UUID is fairly simple, since everything is in one process and one thread. But we're testing a sharded HTTP implementation and running into some difficulties getting this data passed around in a way that lets us trace all log lines generated by a request back to its UUID.

The first thing I tried was to add the UUID to the SolrParams. This achieves the goal of getting those values logged on the shards if a request is successful, but we miss having those values in the MDC if there are other log lines before the final log line -- e.g. an Exception in a custom component.

My current thought is that sending HTTP headers with diagnostic information would be very useful. Those could be placed in the MDC even before handing the work off to SolrDispatchFilter, so that any Solr problem will have the proper logging. I.e., every additional header added to a Solr request gets a "Solr-" prefix; on the server, we look for those headers and add them to the SLF4J MDC [3]. Here's a patch [4] that does this that we're testing out.

Is this a good idea? Would anyone else find this useful? If so, I'll open a ticket.

--Gregg

[1] http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/MDC.html
[2] http://twitter.github.io/zipkin/
[3] http://www.slf4j.org/api/org/slf4j/MDC.html
[4] https://gist.github.com/greggdonovan/9982327
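The client side of the proposal could be sketched as below. The header name "Solr-Request-UUID" is illustrative, not taken from the patch; with SolrJ or raw HttpClient the returned map would presumably be attached via a request interceptor or equivalent hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Client-side sketch of the proposal: diagnostic values travel as HTTP
// headers under a "Solr-" prefix. The specific header name below is a
// hypothetical example, not one defined by the patch.
public class TraceHeaders {
    static final String PREFIX = "Solr-";

    // Build the extra tracing headers for one outgoing Solr request.
    public static Map<String, String> forRequest(String requestUuid) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put(PREFIX + "Request-UUID", requestUuid);
        return headers;
    }

    public static String newRequestUuid() {
        return UUID.randomUUID().toString();
    }
}
```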