Re: Doco, PutSolrContentStream

2022-02-24 Thread Nathan Gough
Hi Dwane,

If possible would you be able to test out the PR here and see if it fixes
your issue? https://github.com/apache/nifi/pull/5727

You can check out the PR and build the code yourself in the
nifi/nifi-nar-bundles/nifi-solr-bundle/nifi-solr-nar directory, or you can
back up your current nifi-solr-nar-1.16.0-SNAPSHOT.nar in the ./nifi/lib
directory and replace it with the one I built, which you can download here:
https://easyupload.io/74ujf4.

Thanks,
Nathan


On Fri, Jan 28, 2022 at 12:42 PM Andrew Lim wrote:

> Hi Dwane,
>
> Thanks for finding and reporting those documentation errors. I filed a
> Jira [1] to fix those.
>
> It looks like the change to the default value of
> nifi.provenance.repository.rollover.time was made in 1.12.0 [2]. I will see
> if we can improve the docs to give more context to why this was done as
> part of [1].
>
> -Drew
>
>
> [1] https://issues.apache.org/jira/browse/NIFI-9642
> [2] https://issues.apache.org/jira/browse/NIFI-7339
>
>
> > On Jan 28, 2022, at 7:56 AM, Nathan Gough  wrote:
> >
> > Hi Dwane,
> >
> > I've created a Jira issue to test and rectify the Solr + ZooKeeper
> > issue: https://issues.apache.org/jira/browse/NIFI-9641
> >
> > Thanks for the report!
> > Nathan
> >
> > On Fri, Jan 28, 2022 at 7:22 AM Dwane Hall wrote:
> > Hey NiFi community, I hope all is well with everyone wherever they may
> > be. I recently updated our NiFi instances from 1.11.4 to 1.15.3 and have
> > made a few observations from this process worth mentioning.
> >
> > Some minor documentation inconsistencies
> > A couple of the default values in nifi.properties appear to have changed
> > between versions (listed below are the old and new values along with
> > links to the documentation).
> >
> >
> > https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#write-ahead-flowfile-repository
> > “The FlowFile Repository checkpoint interval. The default value is 2
> > mins.” [new default value is 20 secs]
> > 1.11.4 nifi.flowfile.repository.checkpoint.interval=2 mins
> > 1.15.3 nifi.flowfile.repository.checkpoint.interval=20 secs
> >
> >
> > https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#persistent-provenance-repository-properties
> > “The amount of time to wait before rolling over the latest data
> > provenance information so that it is available in the User Interface. The
> > default value is 30 secs.”
> > https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#system-properties
> > “If processing a high volume of events, change
> > nifi.provenance.repository.rollover.time from a default of 30 secs to 1
> > min and ...” [The new default value is 10 min].
> > 1.11.4 nifi.provenance.repository.rollover.time=30 sec
> > 1.15.3 nifi.provenance.repository.rollover.time=10 min
> > This seems to be a significant change. Was there any reason for this new
> > default setting? I was unable to find documentation referencing the
> > increase.
> >
> > PutSolrContentStream processor issues
> >
> > Secondly, after a successful upgrade I noticed our use of the
> > PutSolrContentStream processor had broken. Looking through the processor
> > code, there was an upgrade to the SolrJ client and a commit in March 2020
> > (referenced below) that appears to prevent nested zk chroot paths for
> > SolrCloud connections (i.e. the ZooKeeper connection string is truncated).
> >
> > SolrUtils.java (nifi/SolrUtils.java at master · apache/nifi · GitHub:
> > https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrUtils.java)
> > The commit of interest regarding the new process for initiating a
> > CloudSolrClient in SolrJ:
> > https://github.com/apache/nifi/commit/9b4292024be6fae188cb1efa3a07dc9489e9a5b4#diff-13320e5b198f236cea296fb01cb7376755d65c444678e781fa0940c2a28db88b
> >
> > For a nested Solr path "/solr/PROD", "/solr/DEV", "/solr/DR" … the
> > string is truncated to the base path only, i.e. "/solr" (this is only an
> > issue for nested chroots).
> >
> > The code of interest is here in the SolrUtils.java class
> >
> > if (SOLR_TYPE_STANDARD.getValue().equals(context.getProperty(SOLR_TYPE).getValue())) {
> >     // Replaced by the commit: return new HttpSolrClient(solrLocation, httpClient);
> >     return new HttpSolrClient.Builder(solrLocation).withHttpClient(httpClient).build();
> >   

Re: Doco, PutSolrContentStream

2022-01-28 Thread Andrew Lim
Hi Dwane,

Thanks for finding and reporting those documentation errors. I filed a Jira [1] 
to fix those.

It looks like the change to the default value of 
nifi.provenance.repository.rollover.time was made in 1.12.0 [2]. I will see if 
we can improve the docs to give more context to why this was done as part of 
[1].

-Drew


[1] https://issues.apache.org/jira/browse/NIFI-9642 

[2] https://issues.apache.org/jira/browse/NIFI-7339 



> On Jan 28, 2022, at 7:56 AM, Nathan Gough  wrote:
> 
> Hi Dwane,
> 
> I've created a Jira issue to test and rectify the Solr + ZooKeeper issue: 
> https://issues.apache.org/jira/browse/NIFI-9641 
> 
> 
> Thanks for the report!
> Nathan
> 
> On Fri, Jan 28, 2022 at 7:22 AM Dwane Hall wrote:
> Hey NiFi community, I hope all is well with everyone wherever they may be. I
> recently updated our NiFi instances from 1.11.4 to 1.15.3 and have made a few
> observations from this process worth mentioning.
>
> Some minor documentation inconsistencies
> A couple of the default values in nifi.properties appear to have changed
> between versions (listed below are the old and new values along with links to
> the documentation).
>  
> https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#write-ahead-flowfile-repository
> “The FlowFile Repository checkpoint interval. The default value is 2 mins.”
> [new default value is 20 secs]
> 1.11.4 nifi.flowfile.repository.checkpoint.interval=2 mins
> 1.15.3 nifi.flowfile.repository.checkpoint.interval=20 secs
>  
> https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#persistent-provenance-repository-properties
> “The amount of time to wait before rolling over the latest data provenance
> information so that it is available in the User Interface. The default value
> is 30 secs.”
> https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#system-properties
> “If processing a high volume of events, change
> nifi.provenance.repository.rollover.time from a default of 30 secs to 1 min
> and ...” [The new default value is 10 min].
> 1.11.4 nifi.provenance.repository.rollover.time=30 sec
> 1.15.3 nifi.provenance.repository.rollover.time=10 min
> This seems to be a significant change. Was there any reason for this new
> default setting? I was unable to find documentation referencing the increase.
>  
> PutSolrContentStream processor issues
>  
> Secondly, after a successful upgrade I noticed our use of the
> PutSolrContentStream processor had broken. Looking through the processor
> code, there was an upgrade to the SolrJ client and a commit in March 2020
> (referenced below) that appears to prevent nested zk chroot paths for
> SolrCloud connections (i.e. the ZooKeeper connection string is truncated).
>  
> SolrUtils.java (nifi/SolrUtils.java at master · apache/nifi · GitHub)
> The commit of interest regarding the new process for initiating a
> CloudSolrClient in SolrJ:
> https://github.com/apache/nifi/commit/9b4292024be6fae188cb1efa3a07dc9489e9a5b4#diff-13320e5b198f236cea296fb01cb7376755d65c444678e781fa0940c2a28db88b
>  
> For a nested Solr path "/solr/PROD", "/solr/DEV", "/solr/DR" … the string is
> truncated to the base path only, i.e. "/solr" (this is only an issue for
> nested chroots).
>  
> The code of interest is here in the SolrUtils.java class
>  
> if (SOLR_TYPE_STANDARD.getValue().equals(context.getProperty(SOLR_TYPE).getValue())) {
>     // Replaced by the commit: return new HttpSolrClient(solrLocation, httpClient);
>     return new HttpSolrClient.Builder(solrLocation).withHttpClient(httpClient).build();
> } else {
>     // CloudSolrClient.Builder now requires a List of ZK addresses and znode for solr as separate parameters
>     final String zk[] = solrLocation.split("/");
>     final List<String> zkList = Arrays.asList(zk[0].split(","));
>     String zkRoot = "/";
>     if (zk.length > 1 && ! zk[1].isEmpty()) {
>         zkRoot += zk[1];
>     }
>   
> 
>  
>
> I think the issue can be resolved by changing this line of code which should 
>
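
The truncation Dwane describes can be reproduced outside NiFi with plain string handling. The sketch below is only an illustration: the class name ZkChrootDemo, its method names, and the connection string "zk1:2181,zk2:2181/solr/PROD" are made up for this example. truncatedRoot mirrors the split("/") logic quoted above, while fullRoot shows one possible way to keep a nested chroot by splitting at the first '/' only. It is not necessarily what the PR for NIFI-9641 does.

```java
import java.util.Arrays;
import java.util.List;

public class ZkChrootDemo {

    /** Mirrors the quoted logic: only the first path segment after the hosts is kept. */
    public static String truncatedRoot(String solrLocation) {
        final String[] zk = solrLocation.split("/");
        String zkRoot = "/";
        if (zk.length > 1 && !zk[1].isEmpty()) {
            zkRoot += zk[1];
        }
        return zkRoot;
    }

    /** Hypothetical alternative: split at the first '/' only and keep everything after it. */
    public static String fullRoot(String solrLocation) {
        final int slash = solrLocation.indexOf('/');
        if (slash < 0 || slash == solrLocation.length() - 1) {
            return "/"; // no chroot given
        }
        return solrLocation.substring(slash);
    }

    /** The comma-separated ZooKeeper hosts that precede the chroot. */
    public static List<String> zkHosts(String solrLocation) {
        final int slash = solrLocation.indexOf('/');
        final String hosts = slash < 0 ? solrLocation : solrLocation.substring(0, slash);
        return Arrays.asList(hosts.split(","));
    }

    public static void main(String[] args) {
        final String location = "zk1:2181,zk2:2181/solr/PROD"; // made-up connection string
        System.out.println(zkHosts(location));       // prints [zk1:2181, zk2:2181]
        System.out.println(truncatedRoot(location)); // prints /solr
        System.out.println(fullRoot(location));      // prints /solr/PROD
    }
}
```

With a single-level chroot such as "zk1:2181/solr" both methods return "/solr", which matches the observation that only nested chroots are affected.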