[ 
https://issues.apache.org/jira/browse/SOLR-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-13471:
----------------------------
    Description: 
Diagnosis/hypothesis summarized/reworded from the comment below...
https://issues.apache.org/jira/browse/SOLR-13471?focusedCommentId=16840999&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16840999

* Assume request execution happens successfully
* then, when the QueryResponseWriter goes to marshal the response, assume there 
is some exception writing the results -- perhaps due to the Resolver w/ 
ResultContext + Searcher (could be an IOException)
** the Exception from attempting to write the response may propagate all the 
way up to the "try" in HttpSolrCall.call() ... where it is caught and passed 
to "sendError"
** sendError creates a completely new SolrQueryResponse, sets the exception on 
it, and asks the QueryResponseWriter to write it out
*** *BUT* the OutputStream has already had a bunch of data written to it ... 
which may result in a byte sequence that confuses the unmarshal code and 
causes weird exceptions -- *or worse: validly structured, but corrupt 
data*
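
To make the hypothesized sequence concrete, here is a minimal sketch using a toy tagged binary format. It is NOT the real javabin wire format and uses no Solr classes; the class name, tag values, and helper methods are all invented for illustration. It only shows how a partial write followed by a fresh "error" write on the same stream can decode cleanly into misleading data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy tagged binary format; NOT the real javabin wire format. */
public class PartialWriteDemo {
  static final byte STR = 1, BOOL = 2, MAP = 3; // invented tag values

  /** Writes a map header: the MAP tag plus the declared entry count. */
  static void writeMapHeader(DataOutputStream out, int size) throws IOException {
    out.writeByte(MAP);
    out.writeInt(size);
  }

  /** Writes one tagged string value. */
  static void writeStr(DataOutputStream out, String s) throws IOException {
    out.writeByte(STR);
    out.writeUTF(s);
  }

  /** Reads one tagged value; map keys are cast to String, as a strict reader would. */
  static Object readVal(DataInputStream in) throws IOException {
    byte tag = in.readByte();
    switch (tag) {
      case STR: return in.readUTF();
      case BOOL: return in.readBoolean();
      case MAP: {
        int size = in.readInt();
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < size; i++) {
          String key = (String) readVal(in); // ClassCastException if the stream is out of sync
          m.put(key, readVal(in));
        }
        return m;
      }
      default: throw new IOException("unknown tag " + tag);
    }
  }

  /** Simulates a response write that fails midway, followed by an appended error response. */
  static byte[] failedThenErrorResponse() throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    try {
      writeMapHeader(out, 2);       // promises two entries
      writeStr(out, "status");
      writeStr(out, "ok");
      writeStr(out, "docs");        // the key is written, its value never is
      throw new IOException("simulated failure while writing docs");
    } catch (IOException e) {
      // "start over" on the SAME stream, as the sendError path is hypothesized to do
      writeMapHeader(out, 1);
      writeStr(out, "error");
      writeStr(out, e.getMessage());
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] wire = failedThenErrorResponse();
    Object decoded = readVal(new DataInputStream(new ByteArrayInputStream(wire)));
    // prints {status=ok, docs={error=simulated failure while writing docs}}
    System.out.println(decoded);
  }
}
```

In this toy run the reader never notices a problem: the error map is silently decoded as the value of "docs". If the simulated failure happened one step earlier (before the "docs" key was written), the appended map would land in a key position and the `(String)` cast would throw a ClassCastException instead, which is the same kind of symptom as in the original report.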

The fundamental problem is that HttpSolrCall tries to "start over" writing the 
response -- but the JavaBinCodec doesn't have any way to recognize that ... it 
can read partial data and then be confused by the "new" data that comes after it

*Perhaps there should be a special "TAG" that means "Ignore everything you've 
already received and start over" that we should emit at the beginning of every 
marshal() call?*
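
A rough sketch of how such a tag could behave, again on a toy format rather than the real JavaBinCodec: the RESET tag value, the ResetSignal exception, and the marshal/unmarshal signatures below are all hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy format with a hypothetical RESET tag; NOT the real javabin wire format. */
public class ResetTagDemo {
  static final byte STR = 1, MAP = 3, RESET = 127; // RESET is an invented tag value

  /** Thrown when the reader meets RESET where a value should start. */
  static class ResetSignal extends RuntimeException {}

  /** Every top-level marshal begins with RESET; failAfter simulates a mid-write failure. */
  static void marshal(DataOutputStream out, Map<String, String> m, int failAfter)
      throws IOException {
    out.writeByte(RESET);
    out.writeByte(MAP);
    out.writeInt(m.size());
    int written = 0;
    for (Map.Entry<String, String> e : m.entrySet()) {
      if (written++ == failAfter) throw new IOException("simulated write failure");
      out.writeByte(STR);
      out.writeUTF(e.getKey());
      out.writeByte(STR);
      out.writeUTF(e.getValue());
    }
  }

  /** Reads one tagged value, signalling a restart if RESET shows up instead. */
  static Object readVal(DataInputStream in) throws IOException {
    byte tag = in.readByte();
    switch (tag) {
      case RESET: throw new ResetSignal();
      case STR: return in.readUTF();
      case MAP: {
        int size = in.readInt();
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < size; i++) {
          String key = (String) readVal(in);
          m.put(key, readVal(in));
        }
        return m;
      }
      default: throw new IOException("unknown tag " + tag);
    }
  }

  /** Top-level read: on any later RESET, drop the partial data and read the new value. */
  static Object unmarshal(DataInputStream in) throws IOException {
    if (in.readByte() != RESET) throw new IOException("stream must start with RESET");
    while (true) {
      try {
        return readVal(in);
      } catch (ResetSignal restart) {
        // the writer started over: whatever was decoded so far is discarded
      }
    }
  }

  /** Fails after one entry, appends an error response, then decodes the whole stream. */
  static Object demo() throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    Map<String, String> response = new LinkedHashMap<>();
    response.put("status", "ok");
    response.put("docs", "not written in this run");
    try {
      marshal(out, response, 1);
    } catch (IOException e) {
      Map<String, String> error = new LinkedHashMap<>();
      error.put("error", e.getMessage());
      marshal(out, error, -1); // -1 means never fail
    }
    return unmarshal(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
  }

  public static void main(String[] args) throws IOException {
    System.out.println(demo()); // prints {error=simulated write failure}
  }
}
```

One caveat this sketch makes visible: the reader only notices RESET when it appears where a value is expected. If a write were cut off in the middle of a value's payload (for example partway through a string's bytes), the reader could consume the RESET byte as data, so a tag like this would probably not be a complete fix by itself.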

----
Original bug report
{panel}
I haven't dug into this, or been able to reproduce it (let alone get any 
additional logging/debugging/breakpoint info) but in a few very sporadic, very 
rare, instances of running TestReplicationHandlerDiskOverFlow, I triggered 
ClassCastException's in JavaBinCodec unmarshal code -- indicating that there is 
some disconnect in expectations between the marshal & unmarshal code paths...

{noformat}
   [junit4]   2> 13342 ERROR (Thread-19) [    ] o.a.s.h.TestReplicationHandlerDiskOverFlow Query Thread Failure
   [junit4]   2>           => org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:54505/solr/collection1: class java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean and java.lang.String are in module java.base of loader 'bootstrap')
   [junit4]   2>        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:622)
   [junit4]   2> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:54505/solr/collection1: class java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean and java.lang.String are in module java.base of loader 'bootstrap')
   [junit4]   2>        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:622) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:207) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:987) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1002) ~[java/:?]
   [junit4]   2>        at org.apache.solr.handler.TestReplicationHandlerDiskOverFlow.lambda$testDiskOverFlow$1(TestReplicationHandlerDiskOverFlow.java:154) ~[test/:?]
   [junit4]   2>        at java.lang.Thread.run(Thread.java:834) [?:?]
   [junit4]   2> Caused by: java.lang.ClassCastException: class java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean and java.lang.String are in module java.base of loader 'bootstrap')
   [junit4]   2>        at org.apache.solr.common.util.JavaBinCodec.readOrderedMap(JavaBinCodec.java:223) ~[java/:?]
   [junit4]   2>        at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:298) ~[java/:?]
   [junit4]   2>        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:280) ~[java/:?]
   [junit4]   2>        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:193) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:50) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:620) ~[java/:?]
   [junit4]   2>        ... 7 more
 
{noformat}

It's possible that something about the test code (or code being tested) is 
doing something it should not be doing, and adding something "unexpected" to the 
response object -- but either the unmarshal code needs to be _as forgiving_ as 
marshal code, or the marshal code should fail fast -- not produce a binary 
stream that the unmarshal code can't parse.
{panel}

  was:
I haven't dug into this, or been able to reproduce it (let alone get any 
additional logging/debugging/breakpoint info) but in a few very sporadic, very 
rare, instances of running TestReplicationHandlerDiskOverFlow, I triggered 
ClassCastException's in JavaBinCodec unmarshal code -- indicating that there is 
some disconnect in expectations between the marshal & unmarshal code paths...

{noformat}
   [junit4]   2> 13342 ERROR (Thread-19) [    ] o.a.s.h.TestReplicationHandlerDiskOverFlow Query Thread Failure
   [junit4]   2>           => org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:54505/solr/collection1: class java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean and java.lang.String are in module java.base of loader 'bootstrap')
   [junit4]   2>        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:622)
   [junit4]   2> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:54505/solr/collection1: class java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean and java.lang.String are in module java.base of loader 'bootstrap')
   [junit4]   2>        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:622) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:207) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:987) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1002) ~[java/:?]
   [junit4]   2>        at org.apache.solr.handler.TestReplicationHandlerDiskOverFlow.lambda$testDiskOverFlow$1(TestReplicationHandlerDiskOverFlow.java:154) ~[test/:?]
   [junit4]   2>        at java.lang.Thread.run(Thread.java:834) [?:?]
   [junit4]   2> Caused by: java.lang.ClassCastException: class java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean and java.lang.String are in module java.base of loader 'bootstrap')
   [junit4]   2>        at org.apache.solr.common.util.JavaBinCodec.readOrderedMap(JavaBinCodec.java:223) ~[java/:?]
   [junit4]   2>        at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:298) ~[java/:?]
   [junit4]   2>        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:280) ~[java/:?]
   [junit4]   2>        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:193) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:50) ~[java/:?]
   [junit4]   2>        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:620) ~[java/:?]
   [junit4]   2>        ... 7 more
 
{noformat}

It's possible that something about the test code (or code being tested) is 
doing something it should not be doing, and adding something "unexpected" to the 
response object -- but either the unmarshal code needs to be _as forgiving_ as 
marshal code, or the marshal code should fail fast -- not produce a binary 
stream that the unmarshal code can't parse.

        Summary: exceptions during response writing can cause javabin to write 
a corrupt/misleading response that may cause exceptions or nonsense data when 
unmarshaled by javabin  (was: javabin codec can marshal data that it can't 
unmarshal (ClassCastException: class java.lang.Boolean cannot be cast to class 
java.lang.String))

> exceptions during response writing can cause javabin to write a 
> corrupt/misleading response that may cause exceptions or nonsense data when 
> unmarshaled by javabin
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13471
>                 URL: https://issues.apache.org/jira/browse/SOLR-13471
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>            Priority: Major
>         Attachments: SOLR-13471.patch, SOLR-13471.patch
>
>


