[ 
https://issues.apache.org/jira/browse/SOLR-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16846076#comment-16846076
 ] 

Hoss Man commented on SOLR-13471:
---------------------------------

{quote}The only possible solution is to modify the format to add a new 
tag that says "discard everything you've seen till now, start from the 
beginning"
{quote}
Or maybe harden the marshal'ing code so that if/when an exception happens it 
"closes" the current object/sequence in some way ... ie: if there is an 
exception while writing a SolrDocument, catch that exception, write the "end of 
document" and then re-throw up to the "writeSolrDocumentList" which can write 
the "end of doclist" and then re-throw, etc...  but i'm not sure how that would 
work given that javabin seems (IIUC) to largely be designed around writing out 
the "size" of things (like lists, maps, strings, etc...) as part of the tag.

it seems like the advantages of having the sizes encoded in the start tags 
would largely be negated if the unmarshal'ing code had to constantly check for 
"end of foo" or "start over entire input" type markers.
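To make the size-prefix concern concrete, here is a minimal sketch using a made-up toy format (one tag byte, then a size for lists). It is NOT the real javabin wire format, and every name in it ({{ToySizePrefixedCodec}}, {{TAG_LIST}}, etc.) is hypothetical. It only illustrates how a writer that fails part way through a sized list and then "starts over" on the same stream can hand the reader something that parses cleanly but is wrong:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Toy size-prefixed format for illustration only; NOT the javabin format.
class ToySizePrefixedCodec {
  static final byte TAG_STRING = 1; // followed by a UTF string
  static final byte TAG_LIST = 2;   // followed by an int size, then that many values

  static void writeString(DataOutputStream out, String s) throws IOException {
    out.writeByte(TAG_STRING);
    out.writeUTF(s);
  }

  static void writeListHeader(DataOutputStream out, int size) throws IOException {
    out.writeByte(TAG_LIST);
    out.writeInt(size); // the reader will trust this count
  }

  static Object read(DataInputStream in) throws IOException {
    byte tag = in.readByte();
    if (tag == TAG_STRING) {
      return in.readUTF();
    }
    if (tag == TAG_LIST) {
      int size = in.readInt();
      List<Object> list = new ArrayList<>(size);
      for (int i = 0; i < size; i++) {
        list.add(read(in)); // no "end of list" marker to check, only the count
      }
      return list;
    }
    throw new IOException("unknown tag " + tag);
  }

  // Simulates: promise 2 docs, write 1, fail, then write an error "from scratch"
  // on the SAME stream, and finally unmarshal whatever bytes were produced.
  static Object corruptedRoundTrip() {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      try {
        writeListHeader(out, 2);
        writeString(out, "doc1");
        throw new IOException("simulated failure writing doc2");
      } catch (IOException e) {
        // The error path believes it is starting a fresh response.
        writeString(out, "ERROR: " + e.getMessage());
      }
      out.flush();
      return read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // prints: [doc1, ERROR: simulated failure writing doc2]
    System.out.println(corruptedRoundTrip());
  }
}
```

In this toy case the reader never sees an error at all: the error message is silently absorbed as the second "document". With a different promised size the same bytes would instead produce an EOF or a type mismatch, which is roughly the range of symptoms described in the issue.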
{quote}How do we plan to solve it in JSON?
{quote}
see the techproduct example output i posted above ... JSON and XML aren't 
affected by this ... although ... now that you mention it, i'm not entirely 
sure _why_ they aren't affected by this?  I had this vague notion that it had 
to do with how those formats inherently include start & end "tags" for 
everything, so even if there is an exception writing a document, they still 
output the {{</doc>}} or {{}}} markup .. but that doesn't actually jibe with 
what's being output above ... somehow/somewhere it's buffering the response and 
throwing it away when it has an exception? (i haven't dug in to see where)
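For reference, the "buffer it and throw it away" behavior guessed at above could look something like the sketch below. This is purely an illustration of the idea, not a claim about where (or whether) Solr actually does this for JSON/XML; {{BufferedResponseSketch}}, {{BodyWriter}} and {{respond}} are hypothetical names invented for the example:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: nothing reaches the client until the body is fully written.
class BufferedResponseSketch {
  interface BodyWriter {
    void writeTo(OutputStream out) throws IOException;
  }

  static void respond(OutputStream client, BodyWriter body) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try {
      try {
        body.writeTo(buffer);
      } catch (IOException e) {
        // Discard the partial response, then write the error into the empty buffer.
        buffer.reset();
        String error = "{\"error\":\"" + e.getMessage() + "\"}";
        buffer.write(error.getBytes(StandardCharsets.UTF_8));
      }
      buffer.writeTo(client); // the client sees either the full body or only the error
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static String demo() {
    ByteArrayOutputStream client = new ByteArrayOutputStream();
    respond(client, out -> {
      out.write("{\"docs\":[{\"id\":\"doc1\"},".getBytes(StandardCharsets.UTF_8));
      throw new IOException("simulated failure writing doc2");
    });
    return new String(client.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // prints: {"error":"simulated failure writing doc2"}
    System.out.println(demo());
  }
}
```

The obvious trade-off is memory: the whole response has to fit in the buffer before anything is sent, which may not be acceptable for large streamed results.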

> exceptions during response writing can cause javabin to write a 
> corrupt/misleading response that may cause exceptions or nonsense data when 
> unmarshaled by javabin
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13471
>                 URL: https://issues.apache.org/jira/browse/SOLR-13471
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>            Priority: Major
>         Attachments: SOLR-13471.patch, SOLR-13471.patch, SOLR-13471.patch
>
>
> Diagnosis/hypothesis summarized/re-worded from comment below...
> https://issues.apache.org/jira/browse/SOLR-13471?focusedCommentId=16840999&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16840999
> * Assume request execution happens successfully
> * then, when the QueryResponseWriter goes to marshal the response, assume 
> there is some exception writing the results -- perhaps due to the Resolver w/ 
> ResultContext + Searcher (could be an IOException)
> ** the Exception from attempting to write the response may propagate all the 
> way up to the "try" in HttpSolrCall.call() ... which once caught is then 
> passed to "sendError"
> ** sendError creates a completely new SolrQueryResponse, sets the exception 
> on it, and asks the QueryResponseWriter to write it out
> *** *BUT* the OutputStream has already had a bunch of data written to it ... 
> which may be resulting in a byte sequence that confuses the unmarshal code 
> and may result in weird exceptions -- *or worse: validly structured, but 
> corrupt data*
> The fundamental problem is that HttpSolrCall tries to "start over" writing the 
> response -- but the JavaBinCodec doesn't have any way to recognize that ... it 
> can read partial data and then be confused by the "new" data that comes after 
> it
> *Perhaps there should be a special "TAG" that means "Ignore everything you've 
> already received and start over" that we should emit at the beginning of every 
> marshal() call?*
> ----
> Original bug report
> {panel}
> I haven't dug into this, or been able to reproduce it (let alone get any 
> additional logging/debugging/breakpoint info) but in a few very sporadic, 
> very rare, instances of running TestReplicationHandlerDiskOverFlow, I 
> triggered ClassCastException's in JavaBinCodec unmarshal code -- indicating 
> that there is some disconnect in expectations between the marshal & unmarshal 
> code paths...
> {noformat}
>    [junit4]   2> 13342 ERROR (Thread-19) [    ] 
> o.a.s.h.TestReplicationHandlerDiskOverFlow Query Thread Failure
>    [junit4]   2>           => 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
> from server at http://127.0.0.1:54505/solr/collection1: class 
> java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean 
> and java.lang.String are in module java.base of loader 'bootstrap')
>    [junit4]   2>      at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:622)
>    [junit4]   2> 
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
> from server at http://127.0.0.1:54505/solr/collection1: class 
> java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean 
> and java.lang.String are in module java.base of loader 'bootstrap')
>    [junit4]   2>      at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:622)
>  ~[java/:?]
>    [junit4]   2>      at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
>  ~[java/:?]
>    [junit4]   2>      at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
>  ~[java/:?]
>    [junit4]   2>      at 
> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:207) 
> ~[java/:?]
>    [junit4]   2>      at 
> org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:987) ~[java/:?]
>    [junit4]   2>      at 
> org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1002) ~[java/:?]
>    [junit4]   2>      at 
> org.apache.solr.handler.TestReplicationHandlerDiskOverFlow.lambda$testDiskOverFlow$1(TestReplicationHandlerDiskOverFlow.java:154)
>  ~[test/:?]
>    [junit4]   2>      at java.lang.Thread.run(Thread.java:834) [?:?]
>    [junit4]   2> Caused by: java.lang.ClassCastException: class 
> java.lang.Boolean cannot be cast to class java.lang.String (java.lang.Boolean 
> and java.lang.String are in module java.base of loader 'bootstrap')
>    [junit4]   2>      at 
> org.apache.solr.common.util.JavaBinCodec.readOrderedMap(JavaBinCodec.java:223)
>  ~[java/:?]
>    [junit4]   2>      at 
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:298) 
> ~[java/:?]
>    [junit4]   2>      at 
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:280) 
> ~[java/:?]
>    [junit4]   2>      at 
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:193) 
> ~[java/:?]
>    [junit4]   2>      at 
> org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:50)
>  ~[java/:?]
>    [junit4]   2>      at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:620)
>  ~[java/:?]
>    [junit4]   2>      ... 7 more
>  
> {noformat}
> It's possible that something about the test code (or code being tested) is 
> doing something it should not be doing, and adding something "unexpected" to 
> the response object -- but either the unmarshal code needs to be _as 
> forgiving_ as marshal code, or the marshal code should fail fast -- not 
> produce a binary stream that the unmarshal code can't parse.
> {panel}


