Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Steve Davids
I had this exact use case. We ended up setting a header value, then
reading it in a Servlet Filter and setting the MDC property there. Because
the filter only reads a header, Solr didn’t complain about the request being
read before it reached SolrDispatchFilter. We used the Jetty web defaults to
put this filter at the beginning of the servlet processing chain without
having to crack open the war.
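
A rough sketch of the filter logic described above, written against plain
JDK types so it runs without a servlet container. Everything named here is
an assumption for illustration: the header name `X-Request-Id`, the key
`requestId`, the `ThreadLocal` map standing in for the SLF4J MDC, the
`Function` standing in for `HttpServletRequest#getHeader`, and the
`Runnable` standing in for `FilterChain#doFilter`. It is not the code from
the original deployment.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Stand-in sketch: copy one request header into a per-thread logging context. */
final class TraceHeaderFilterSketch {
    // Hypothetical names; the thread does not say which header or MDC key was used.
    static final String TRACE_HEADER = "X-Request-Id";
    static final String MDC_KEY = "requestId";

    // Per-thread map standing in for org.slf4j.MDC (assumption, keeps the sketch JDK-only).
    static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    /**
     * getHeader stands in for HttpServletRequest#getHeader,
     * chain stands in for FilterChain#doFilter.
     * Only the header is read, so the request body and parameters stay untouched.
     */
    static void doFilter(Function<String, String> getHeader, Runnable chain) {
        String id = getHeader.apply(TRACE_HEADER);
        if (id != null) {
            CONTEXT.get().put(MDC_KEY, id);
        }
        try {
            chain.run(); // downstream work sees the context value
        } finally {
            // Clear the key so a pooled thread does not leak the id into the next request.
            CONTEXT.get().remove(MDC_KEY);
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put(TRACE_HEADER, "abc-123");
        doFilter(headers::get, () ->
                System.out.println("requestId=" + CONTEXT.get().get(MDC_KEY)));
    }
}
```

In a real filter the same shape would presumably apply: read the header, put
it in the MDC, call the chain inside try/finally, and remove the key
afterwards.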

-Steve

On Apr 7, 2014, at 8:01 PM, Michael Sokolov wrote:

> Yes, I see.  SolrDispatchFilter is  - not really written with extensibility 
> in mind.
> 
> -Mike



Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Michael Sokolov
Yes, I see.  SolrDispatchFilter is  - not really written with 
extensibility in mind.


-Mike

On 4/7/14 3:50 PM, Gregg Donovan wrote:

> Michael,
>
> Thanks! Unfortunately, as we use POSTs, that approach would trigger the
> getParameterIncompatibilityException call due to the Enumeration of
> getParameterNames before SolrDispatchFilter has a chance to access the
> InputStream.
>
> I opened https://issues.apache.org/jira/browse/SOLR-5969 to discuss further
> and attached our current patch.







Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Gregg Donovan
Michael,

Thanks! Unfortunately, as we use POSTs, that approach would trigger the
getParameterIncompatibilityException call due to the Enumeration of
getParameterNames before SolrDispatchFilter has a chance to access the
InputStream.

I opened https://issues.apache.org/jira/browse/SOLR-5969 to discuss further
and attached our current patch.


On Mon, Apr 7, 2014 at 2:02 PM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:

> I had to grapple with something like this problem when I wrote Lux's
> app-server.  I extended SolrDispatchFilter and handle parameter swizzling
> to keep everything nicey-nicey for Solr while being able to play games with
> parameters of my own.  Perhaps this will give you some ideas:
>
> https://github.com/msokolov/lux/blob/master/src/main/java/lux/solr/LuxDispatchFilter.java
>
> It's definitely hackish, but seems to get the job done - for me - it's not
> a reusable component, but might serve as an illustration of one way to
> handle the problem
>
> -Mike


Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Michael Sokolov
I had to grapple with something like this problem when I wrote Lux's 
app-server.  I extended SolrDispatchFilter and handle parameter 
swizzling to keep everything nicey-nicey for Solr while being able to 
play games with parameters of my own.  Perhaps this will give you some 
ideas:


https://github.com/msokolov/lux/blob/master/src/main/java/lux/solr/LuxDispatchFilter.java

It's definitely hackish, but seems to get the job done - for me - it's 
not a reusable component, but might serve as an illustration of one way 
to handle the problem


-Mike

On 04/07/2014 12:23 PM, Gregg Donovan wrote:

> That was my first attempt, but it's much trickier than I anticipated.
>
> A filter that calls HttpServletRequest#getParameter() before
> SolrDispatchFilter will trigger an exception  -- see
> getParameterIncompatibilityException [1] -- if the request is a POST. It
> seems that Solr depends on the configured per-core SolrRequestParser to
> properly parse the request parameters. A servlet filter that came before
> SolrDispatchFilter would need to fetch the correct SolrRequestParser for
> the requested core, parse the request, and reset the InputStream before
> pulling the data into the MDC. It also duplicates the work of request
> parsing. It's especially tricky if you want to remove the tracing
> parameters from the SolrParams and just have them in the MDC to avoid them
> being logged twice.
>
>
> [1]
> https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/servlet/SolrRequestParsers.java#L621:L628







Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Alexandre Rafalovitch
So to rephrase:

Solr will barf at unknown parameters, so we cannot currently send them in
band.

And the out-of-band approach does not work due to POST body handling complexity.

You are proposing effectively a dynamic set with common prefix to stop the
complaints. Plus the code to propagate those params.

Is that a good general description? I am just wondering if this can be
matched to some other real issues as well.

Regards,
 Alex
On 07/04/2014 11:23 pm, "Gregg Donovan"  wrote:

> That was my first attempt, but it's much trickier than I anticipated.
>
> A filter that calls HttpServletRequest#getParameter() before
> SolrDispatchFilter will trigger an exception  -- see
> getParameterIncompatibilityException [1] -- if the request is a POST. It
> seems that Solr depends on the configured per-core SolrRequestParser to
> properly parse the request parameters. A servlet filter that came before
> SolrDispatchFilter would need to fetch the correct SolrRequestParser for
> the requested core, parse the request, and reset the InputStream before
> pulling the data into the MDC. It also duplicates the work of request
> parsing. It's especially tricky if you want to remove the tracing
> parameters from the SolrParams and just have them in the MDC to avoid them
> being logged twice.
>
>
> [1]
>
> https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/servlet/SolrRequestParsers.java#L621:L628


Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Gregg Donovan
That was my first attempt, but it's much trickier than I anticipated.

A filter that calls HttpServletRequest#getParameter() before
SolrDispatchFilter will trigger an exception  -- see
getParameterIncompatibilityException [1] -- if the request is a POST. It
seems that Solr depends on the configured per-core SolrRequestParser to
properly parse the request parameters. A servlet filter that came before
SolrDispatchFilter would need to fetch the correct SolrRequestParser for
the requested core, parse the request, and reset the InputStream before
pulling the data into the MDC. It also duplicates the work of request
parsing. It's especially tricky if you want to remove the tracing
parameters from the SolrParams and just have them in the MDC to avoid them
being logged twice.


[1]
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/servlet/SolrRequestParsers.java#L621:L628


On Sun, Apr 6, 2014 at 2:20 PM, Alexandre Rafalovitch wrote:

> On the second thought,
>
> If you are already managing to pass the value using the request
> parameters, what stops you from just having a servlet filter looking
> for that parameter and assigning it directly to the MDC context?
>
> Regards,
>Alex.
> Personal website: http://www.outerthoughts.com/
> Current project: http://www.solr-start.com/ - Accelerating your Solr
> proficiency


Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-06 Thread Alexandre Rafalovitch
On second thought,

If you are already managing to pass the value using the request
parameters, what stops you from just having a servlet filter looking
for that parameter and assigning it directly to the MDC context?

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Sat, Apr 5, 2014 at 7:45 AM, Alexandre Rafalovitch
 wrote:
> I like the idea. No comments about implementation, leave it to others.
>
> But if it is done, maybe somebody very familiar with logging can also
> review Solr's current logging config. I suspect it is not optimized
> for troubleshooting at this point.
>
> Regards,
>Alex.
> Personal website: http://www.outerthoughts.com/
> Current project: http://www.solr-start.com/ - Accelerating your Solr 
> proficiency


Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-04 Thread Alexandre Rafalovitch
I like the idea. No comments about implementation, leave it to others.

But if it is done, maybe somebody very familiar with logging can also
review Solr's current logging config. I suspect it is not optimized
for troubleshooting at this point.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Sat, Apr 5, 2014 at 3:16 AM, Gregg Donovan  wrote:
> We have some metadata -- e.g. a request UUID -- that we log to every log
> line using Log4J's MDC [1]. The UUID logging allows us to connect any log
> lines we have for a given request across servers. Sort of like Zipkin [2].
>
> Currently we're using EmbeddedSolrServer without sharding, so adding the
> UUID is fairly simple, since everything is in one process and one thread.
> But, we're testing a sharded HTTP implementation and running into some
> difficulties getting this data passed around in a way that lets us trace
> all log lines generated by a request to its UUID.
>


Distributed tracing for Solr via adding HTTP headers?

2014-04-04 Thread Gregg Donovan
We have some metadata -- e.g. a request UUID -- that we log to every log
line using Log4J's MDC [1]. The UUID logging allows us to connect any log
lines we have for a given request across servers. Sort of like Zipkin [2].

Currently we're using EmbeddedSolrServer without sharding, so adding the
UUID is fairly simple, since everything is in one process and one thread.
But, we're testing a sharded HTTP implementation and running into some
difficulties getting this data passed around in a way that lets us trace
all log lines generated by a request to its UUID.

The first thing I tried was to add the UUID by adding it to the SolrParams.
This achieves the goal of getting those values logged on the shards if a
request is successful, but we miss having those values in the MDC if there
are other log lines before the final log line. E.g. an Exception in a
custom component.

My current thought is that sending HTTP headers with diagnostic information
would be very useful. Those could be placed in the MDC even before handing
off work to SolrDispatchFilter, so that any Solr problem will have the
proper logging.

I.e. every additional header added to a Solr request gets a "Solr-" prefix.
On the server, we look for those headers and add them to the SLF4J MDC[3].
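
To make the prefix idea concrete, here is a minimal sketch of the
server-side selection step, using only JDK collections. It is not the code
from the patch [4]. The class and method names, the example header names,
and the decision to strip the "Solr-" prefix before using the remainder as
the MDC key are all assumptions made for illustration; the returned map is
what a real filter would presumably feed to `MDC.put` entry by entry.

```java
import java.util.Map;
import java.util.TreeMap;

/** Sketch: pick out "Solr-"-prefixed headers and turn them into MDC key/value pairs. */
final class PrefixedHeaderSketch {
    static final String PREFIX = "Solr-";

    /**
     * headers maps header name to value (a stand-in for iterating the
     * servlet request's header names). Returns the entries to place in the MDC.
     */
    static Map<String, String> mdcEntries(Map<String, String> headers) {
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            String name = e.getKey();
            // HTTP header names are case-insensitive, so match the prefix ignoring case.
            boolean hasPrefix = name.regionMatches(true, 0, PREFIX, 0, PREFIX.length());
            if (hasPrefix && name.length() > PREFIX.length()) {
                // Assumption: drop the prefix and use the rest as the MDC key.
                out.put(name.substring(PREFIX.length()), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new TreeMap<>();
        headers.put("Solr-Request-UUID", "u1"); // hypothetical tracing header
        headers.put("Accept", "*/*");           // ignored: no prefix
        System.out.println(mdcEntries(headers));
    }
}
```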

Here's a patch [4] that does this that we're testing out. Is this a good
idea? Would anyone else find this useful? If so, I'll open a ticket.

--Gregg

[1] http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/MDC.html
[2] http://twitter.github.io/zipkin/
[3] http://www.slf4j.org/api/org/slf4j/MDC.html
[4] https://gist.github.com/greggdonovan/9982327