[ 
https://issues.apache.org/jira/browse/SOLR-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627028#comment-16627028
 ] 

Karl Wright commented on SOLR-12798:
------------------------------------

[~noble.paul] We have a custom implementation because SolrJ, and indeed 
HttpComponents/HttpClient, have problems we're forced to work around.  These 
have been raised before but apparently have not been taken seriously so far.  
The need to work around things has become even more pressing with the latest 
release.

ModifiedHttpSolrClient is a derivation of HttpSolrClient.  The method 
overridden, createMethod(), is a direct copy of HttpSolrClient.createMethod() 
with certain very specific changes.  These are apparently all still necessary.  
I've included the method code below. 

If I disable this custom method, and use standard code, I *never* get multipart 
form posts at all.  That is unacceptable in this application.

With the current modifications included below, I get multipart posts for 
everything, including for deletions, which breaks because Solr doesn't like 
that.
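
To make the branch selection in the method below easier to follow, here is a 
minimal, self-contained model of it.  It is a sketch, not SolrJ code: 
BodyPathSketch, BodyPath and choose() are hypothetical names, content streams 
are reduced to strings, and it assumes that a delete request exposes its body 
as a content stream, which is what the observed behavior suggests.

```java
import java.util.List;

// Hypothetical model of the branch selection in the overridden createMethod().
// None of these names are SolrJ API; streams are reduced to plain strings.
final class BodyPathSketch {
  enum Method { POST, PUT }
  enum BodyPath { CONTENT_WRITER, MULTIPART, FORM_PARAMS, SINGLE_STREAM }

  static BodyPath choose(boolean useMultiPartPost, Method method,
                         boolean hasContentWriter, List<String> requestStreams) {
    // As in the override: with useMultiPartPost on, any request that exposes
    // at least one content stream is forced onto POST.
    boolean mustUseMultipart = useMultiPartPost
        && requestStreams != null && !requestStreams.isEmpty();
    Method methodToUse = mustUseMultipart ? Method.POST : method;

    // Assumption: without a content writer, the request writer hands back the
    // request's own streams; with one, streams start out as null.
    List<String> streams = hasContentWriter ? null : requestStreams;
    if (mustUseMultipart) {
      streams = requestStreams;
    }

    boolean isMultipart = useMultiPartPost && methodToUse == Method.POST
        && streams != null && !streams.isEmpty();

    if (hasContentWriter && !isMultipart) {
      return BodyPath.CONTENT_WRITER;
    } else if (streams == null || isMultipart) {
      return isMultipart ? BodyPath.MULTIPART : BodyPath.FORM_PARAMS;
    }
    return BodyPath.SINGLE_STREAM;
  }
}
```

Under these assumptions, choose(true, Method.PUT, true, streams) returns 
MULTIPART for any non-empty stream list, whether the stream carries documents 
or a delete, which matches the behavior described above.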

I'm asking for advice as to how to get multipart posts only for documents, 
whether they are transmitted by ContentStreamUpdateRequest or by 
UpdateRequest.add(SolrInputDocument).
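
One possible shape for such a restriction, offered only as a sketch: decide 
per request whether it carries documents, and allow multipart only then.  
MultipartPolicySketch, RequestView and wantsMultipart() are hypothetical names 
rather than SolrJ API; in a real client the three fields would have to be 
filled in from the actual request object, and how best to do that in 7.4 is 
exactly the open question.

```java
// Hypothetical stand-ins for the request types; none of this is SolrJ API.
final class MultipartPolicySketch {

  /** Minimal view of a request: what kind it is and what it carries. */
  static final class RequestView {
    final boolean isContentStreamUpdate; // e.g. a post meant for Solr Cell
    final int documentCount;             // documents queued for indexing
    final int deleteCount;               // delete-by-id plus delete-by-query entries

    RequestView(boolean isContentStreamUpdate, int documentCount, int deleteCount) {
      this.isContentStreamUpdate = isContentStreamUpdate;
      this.documentCount = documentCount;
      this.deleteCount = deleteCount;
    }
  }

  /** Returns true only for requests that actually carry documents. */
  static boolean wantsMultipart(boolean useMultiPartPost, RequestView r) {
    if (!useMultiPartPost) {
      return false;
    }
    if (r.isContentStreamUpdate) {
      return true;
    }
    // Pure add requests qualify; deletes, commits, and mixed add/delete
    // requests stay on the default (non-multipart) path.
    return r.documentCount > 0 && r.deleteCount == 0;
  }
}
```

In the method below, a check like this would be combined with the existing 
test on the request's content streams before the method is switched to POST.  
Treating mixed add/delete requests as non-multipart is a conservative choice 
made for the sketch, not something the discussion here settles.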

{code}
  @Override
  protected HttpRequestBase createMethod(SolrRequest request, String collection)
      throws IOException, SolrServerException {
    if (request instanceof V2RequestSupport) {
      request = ((V2RequestSupport) request).getV2Request();
    }
    SolrParams params = request.getParams();
    RequestWriter.ContentWriter contentWriter =
        requestWriter.getContentWriter(request);
    Collection<ContentStream> streams = contentWriter == null
        ? requestWriter.getContentStreams(request) : null;
    String path = requestWriter.getPath(request);
    if (path == null || !path.startsWith("/")) {
      path = DEFAULT_PATH;
    }
    
    ResponseParser parser = request.getResponseParser();
    if (parser == null) {
      parser = this.parser;
    }
    
    // The parser 'wt=' and 'version=' params are used instead of the original
    // params
    ModifiableSolrParams wparams = new ModifiableSolrParams(params);
    if (parser != null) {
      wparams.set(CommonParams.WT, parser.getWriterType());
      wparams.set(CommonParams.VERSION, parser.getVersion());
    }
    if (invariantParams != null) {
      wparams.add(invariantParams);
    }

    String basePath = baseUrl;
    if (collection != null)
      basePath += "/" + collection;

    if (request instanceof V2Request) {
      if (System.getProperty("solr.v2RealPath") == null) {
        basePath = baseUrl.replace("/solr", "/api");
      } else {
        basePath = baseUrl + "/____v2";
      }
    }

    if (SolrRequest.METHOD.GET == request.getMethod()) {
      if (streams != null || contentWriter != null) {
        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
            "GET can't send streams!");
      }

      return new HttpGet(basePath + path + toQueryString(wparams, false));
    }

    if (SolrRequest.METHOD.DELETE == request.getMethod()) {
      return new HttpDelete(basePath + path + toQueryString(wparams, false));
    }

    if (SolrRequest.METHOD.POST == request.getMethod()
        || SolrRequest.METHOD.PUT == request.getMethod()) {

      // UpdateRequest uses PUT now, and ContentStreamUpdateRequest uses POST.
      // We must override PUT with POST if multipart is on.
      // If useMultiPartPost is on, we fall back to getting streams directly
      // from the request.
      final boolean mustUseMultipart;
      final SolrRequest.METHOD methodToUse;
      if (this.useMultiPartPost) {
        final Collection<ContentStream> requestStreams =
            request.getContentStreams();
        mustUseMultipart = requestStreams != null && requestStreams.size() > 0;
        if (mustUseMultipart) {
          System.out.println("Overriding with multipart post");
          streams = requestStreams;
          methodToUse = SolrRequest.METHOD.POST;
        } else {
          methodToUse = request.getMethod();
        }
      } else {
        mustUseMultipart = false;
        methodToUse = request.getMethod();
      }
      
      //System.out.println("Post or put");
      String url = basePath + path;
      /*
      boolean hasNullStreamName = false;
      if (streams != null) {
        for (ContentStream cs : streams) {
          if (cs.getName() == null) {
            hasNullStreamName = true;
            break;
          }
        }
      }
      */
      
      /*
      final boolean isMultipart = ((this.useMultiPartPost && 
SolrRequest.METHOD.POST == methodToUse)
          || (streams != null && streams.size() > 1)) && !hasNullStreamName;
      */
      final boolean isMultipart = this.useMultiPartPost
          && SolrRequest.METHOD.POST == methodToUse
          && (streams != null && streams.size() >= 1);
          
      System.out.println("isMultipart = "+isMultipart);
      
      LinkedList<NameValuePair> postOrPutParams = new LinkedList<>();

      if(contentWriter != null && !isMultipart) {
        //System.out.println(" using contentwriter");
        String fullQueryUrl = url + toQueryString(wparams, false);
        HttpEntityEnclosingRequestBase postOrPut =
            SolrRequest.METHOD.POST == methodToUse
                ? new HttpPost(fullQueryUrl) : new HttpPut(fullQueryUrl);
        postOrPut.addHeader("Content-Type",
            contentWriter.getContentType());
        postOrPut.setEntity(new BasicHttpEntity(){
          @Override
          public boolean isStreaming() {
            return true;
          }

          @Override
          public void writeTo(OutputStream outstream) throws IOException {
            contentWriter.write(outstream);
          }
        });
        return postOrPut;

      } else if (streams == null || isMultipart) {
        // send server list and request list as query string params
        ModifiableSolrParams queryParams =
            calculateQueryParams(getQueryParams(), wparams);
        queryParams.add(calculateQueryParams(request.getQueryParams(), wparams));
        String fullQueryUrl = url + toQueryString(queryParams, false);
        HttpEntityEnclosingRequestBase postOrPut = fillContentStream(
            methodToUse, streams, wparams, isMultipart, postOrPutParams,
            fullQueryUrl);
        return postOrPut;
      }
      // If it has one stream, it is the post body; put the params in the URL
      else {
        String fullQueryUrl = url + toQueryString(wparams, false);
        HttpEntityEnclosingRequestBase postOrPut =
            SolrRequest.METHOD.POST == methodToUse
                ? new HttpPost(fullQueryUrl) : new HttpPut(fullQueryUrl);
        fillSingleContentStream(streams, postOrPut);

        return postOrPut;
      }
    }

    throw new SolrServerException("Unsupported method: " + request.getMethod());

  }

  private void fillSingleContentStream(Collection<ContentStream> streams,
      HttpEntityEnclosingRequestBase postOrPut) throws IOException {
    // Single stream as body
    // Using a loop just to get the first one
    final ContentStream[] contentStream = new ContentStream[1];
    for (ContentStream content : streams) {
      contentStream[0] = content;
      break;
    }
    Long size = contentStream[0].getSize();
    postOrPut.setEntity(new InputStreamEntity(contentStream[0].getStream(),
        size == null ? -1 : size) {
      @Override
      public Header getContentType() {
        return new BasicHeader("Content-Type",
            contentStream[0].getContentType());
      }

      @Override
      public boolean isRepeatable() {
        return false;
      }
    });

  }

  private HttpEntityEnclosingRequestBase fillContentStream(
      SolrRequest.METHOD methodToUse, Collection<ContentStream> streams,
      ModifiableSolrParams wparams, boolean isMultipart,
      LinkedList<NameValuePair> postOrPutParams, String fullQueryUrl)
      throws IOException {
    HttpEntityEnclosingRequestBase postOrPut =
        SolrRequest.METHOD.POST == methodToUse
            ? new HttpPost(fullQueryUrl) : new HttpPut(fullQueryUrl);

    if (!isMultipart) {
      postOrPut.addHeader("Content-Type",
          "application/x-www-form-urlencoded; charset=UTF-8");
    }

    List<FormBodyPart> parts = new LinkedList<>();
    Iterator<String> iter = wparams.getParameterNamesIterator();
    while (iter.hasNext()) {
      String p = iter.next();
      String[] vals = wparams.getParams(p);
      if (vals != null) {
        for (String v : vals) {
          if (isMultipart) {
            parts.add(new FormBodyPart(p,
                new StringBody(v, StandardCharsets.UTF_8)));
          } else {
            postOrPutParams.add(new BasicNameValuePair(p, v));
          }
        }
      }
    }

    // TODO: remove deprecated - first simple attempt failed,
    // see {@link MultipartEntityBuilder}
    if (isMultipart && streams != null) {
      for (ContentStream content : streams) {
        String contentType = content.getContentType();
        if (contentType == null) {
          contentType = BinaryResponseParser.BINARY_CONTENT_TYPE; // default
        }
        String name = content.getName();
        if (name == null) {
          name = "";
        }
        parts.add(new FormBodyPart(encodeForHeader(name),
            new InputStreamBody(
                content.getStream(),
                ContentType.parse(contentType),
                encodeForHeader(content.getName()))));
      }
    }

    System.out.println("Using multipart post!");
    if (parts.size() > 0) {
      ModifiedMultipartEntity entity = new ModifiedMultipartEntity(
          HttpMultipartMode.STRICT, null, StandardCharsets.UTF_8);
      //MultipartEntity entity = new MultipartEntity(HttpMultipartMode.STRICT);
      for (FormBodyPart p : parts) {
        entity.addPart(p);
      }
      postOrPut.setEntity(entity);
    } else {
      //not using multipart
      postOrPut.setEntity(
          new UrlEncodedFormEntity(postOrPutParams, StandardCharsets.UTF_8));
    }
    return postOrPut;
  }

{code}


> Structural changes in SolrJ since version 7.0.0 have effectively disabled 
> multipart post
> ----------------------------------------------------------------------------------------
>
>                 Key: SOLR-12798
>                 URL: https://issues.apache.org/jira/browse/SOLR-12798
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrJ
>    Affects Versions: 7.4
>            Reporter: Karl Wright
>            Priority: Major
>
> Project ManifoldCF uses SolrJ to post documents to Solr.  When upgrading from 
> SolrJ 7.0.x to SolrJ 7.4, we encountered significant structural changes to 
> SolrJ's HttpSolrClient class that seemingly disable any use of multipart 
> post.  This is critical because ManifoldCF's documents often contain metadata 
> in excess of 4K that therefore cannot be stuffed into a URL.
> The changes in question seem to have been made by Noble Paul on 
> 10/31/2017, with the introduction of the RequestWriter mechanism.  Basically, 
> if a request has a RequestWriter, it is used exclusively to write the 
> request, and that overrides the stream mechanism completely.  I haven't 
> traced it back to a specific ticket.
> ManifoldCF's usage of SolrJ involves the creation of 
> ContentStreamUpdateRequests for all posts meant for Solr Cell, and the 
> creation of UpdateRequests for posts not meant for Solr Cell (as well as for 
> delete and commit requests).  For our release cycle that is taking place 
> right now, we're shipping a modified version of HttpSolrClient that ignores 
> the RequestWriter when dealing with ContentStreamUpdateRequests.  We 
> apparently cannot use multipart for all requests because, when we do, Solr 
> responds with "pfountz Should not get here!" errors, which come back as HTTP 
> 500 responses.  That should not happen either, in my opinion.



