[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-11-27 Thread joewitt
Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2671
  
If you take a look at 
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#core-properties-br
and check out the 'nifi.nar.library.directory' section, you can see how someone 
could be guided to create a directory for custom nars, add that in there, and 
start up.
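
A minimal sketch of that setup, assuming the Administration Guide's convention that additional NAR locations are declared with keys of the form `nifi.nar.library.directory.<suffix>`. The directory name `custom-nars`, the suffix `custom`, and the function name `install_custom_nar` are illustrative assumptions, not names from this thread.

```shell
#!/bin/sh
# Sketch only: "custom-nars" and the ".custom" property suffix are
# hypothetical names chosen for illustration.
install_custom_nar() {
  nifi_home="$1"   # NiFi installation directory
  nar_file="$2"    # path to the NAR file to install
  props="$nifi_home/conf/nifi.properties"
  entry='nifi.nar.library.directory.custom=./custom-nars'

  # Create the extra NAR directory and copy the NAR into it.
  mkdir -p "$nifi_home/custom-nars" || return 1
  cp "$nar_file" "$nifi_home/custom-nars/" || return 1

  # Register the directory only once: -x matches the whole line,
  # -F treats the entry as a fixed string rather than a regex.
  grep -qxF "$entry" "$props" || printf '%s\n' "$entry" >> "$props"
}
```

It could be called as `install_custom_nar /opt/nifi ./some-extension.nar` (both paths are placeholders), followed by a NiFi restart so the new directory is scanned.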

Once we have support for extensions in the NiFi Registry this will be 
beautiful/easy.


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-11-27 Thread ryanjdew
Github user ryanjdew commented on the issue:

https://github.com/apache/nifi/pull/2671
  
@joewitt I understand the concern. We would like to simplify the user 
experience of using NiFi with MarkLogic processors. We have a current model for 
creating and releasing NARs via a GitHub releases page, but would you happen to 
have a good example of a process using Maven publishing with NARs in NiFi?


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-11-27 Thread joewitt
Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2671
  
As a good example where we should have thought it through more as well, we 
have some cool work done by the InfluxDB folks.  They're now wanting to improve 
it in https://github.com/apache/nifi/pull/2743

But the reality is we just don't have people knowledgeable enough to do 
reliable code reviews/testing of this.



---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-11-27 Thread joewitt
Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2671
  
@ryanjdew We have addressed the Travis-CI related test issues and the build 
has been stable since.  It could break again as people add timing-dependent 
tests that behave wildly differently on slow environments, but we'll see.

This PR, assuming the L&N and such are sorted now, is just tricky because 
we need a committer with time to devote to learning MarkLogic enough to set up 
an environment and verify function or leverage a provided instance (not a 
sustainable model).  It is a good example where the limits of what the 
community can reasonably support come into play.

Now, that said, this is probably a really cool and useful thing to offer 
folks and beneficial to both the NiFi user base and the MarkLogic user base.  This 
is why I suggested MarkLogic folks just hang onto this code in some public 
GitHub repo and have it be ALv2.  They can publish their nars into Maven 
Central or wherever they do and provide instructions to it.  I'd be supportive, 
and I'd assume the community at large would, of having links to such extensions 
on the apache website.  This feels to me like the best tradeoff right now for 
all parties.



---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-11-27 Thread ryanjdew
Github user ryanjdew commented on the issue:

https://github.com/apache/nifi/pull/2671
  
Following up, are there any other concerns with this PR? If needed, I can 
provide credentials to a MarkLogic instance for testing.


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-10-31 Thread joewitt
Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2671
  
team; given https://github.com/marklogic/nifi/releases could we consider 
closing this PR and keeping the MarkLogic artifact creation/maintenance 
something MarkLogic takes care of at this time? It is a perfectly fine model.  
We could even create a nifi web page to point at vendor/other community 
managed/supported extensions possibly.


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-09-11 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2671
  
@vivekmuniyandi you have a merge conflict now.


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-14 Thread vivekmuniyandi
Github user vivekmuniyandi commented on the issue:

https://github.com/apache/nifi/pull/2671
  
@joewitt I have addressed all your comments except for the License and 
Notice comments. Can you please let us know what more we should add apart from 
the LICENSE and NOTICE files prepared by our legal team, which we have included 
in the root directory of the nar? Those account for all the dependencies 
added. What more should be added? Please help. Thanks.


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-14 Thread vivekmuniyandi
Github user vivekmuniyandi commented on the issue:

https://github.com/apache/nifi/pull/2671
  
@MikeThomsen Sure, will add that to our backlog. Thanks!


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-14 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2671
  
@vivekmuniyandi Unrelated, but I don't have your email: consider adding a 
MarkLogicLookupService in a future sprint. You can look at HBaseLookupService 
and MongoDBLookupService as examples. Might be highly useful to users to be 
able to enrich a record set using MarkLogic. I have PRs open for some 
LookupService-related tasks that add some additional schema-related 
capabilities and those might be useful to your team on that issue.


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-10 Thread vivekmuniyandi
Github user vivekmuniyandi commented on the issue:

https://github.com/apache/nifi/pull/2671
  
Thanks @joewitt for the comment. I am addressing all the changes you have 
mentioned. I will address the SSLContext and remove the Kerberos and 
Certificate auth for now. 

> The other thing that needs to happen is the nar bundles need their 
> LICENSE/NOTICE file(s) added if necessary. I looked at one of the nars and 
> there would definitely need to be entries.

With regard to this, I have a 
[LICENSE](https://github.com/apache/nifi/pull/2671/files#diff-53deed39bf31085fbecf77ea6a2382dc)
and 
[NOTICE](https://github.com/apache/nifi/pull/2671/files#diff-a2f6b487a7a70d5f43fa320730b2c87a)
file prepared by our legal team in the root directory of the NiFi MarkLogic 
bundle (to account for the contents of the root directory and the 
subdirectories). Those account for all the dependencies added in our MarkLogic 
bundle. Should we do something more? Can you explain a bit more here as to what 
is required? 

Thanks for all the help!


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-09 Thread joewitt
Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2671
  
OK, I've attached a patch which helps with some aspects of POM construction, 
flagging things like resource utilization (since it appears to be loading full 
content into memory), and renaming the service to indicate it is a MarkLogic 
service rather than just a database service.

There is an outstanding need to sort out the security configuration.  The 
SSLContext handling should use the standard mechanism for obtaining it, as you 
can follow from a number of other processors.  Also, there is a Kerberos 
context for the security setting, but there do not appear to be any associated 
settings for the user.  The security configurations should be removed in favor 
of simple/digest for now OR completed, with some consistency with other items.  
For security-relevant things CVEs become a concern, so we take these more 
seriously.  Things about the performance/logic of the processor's interaction 
with MarkLogic we can improve over time if needed, but security we want to get 
right up front.

The other thing that needs to happen is the nar bundles need their 
LICENSE/NOTICE file(s) added if necessary.  I looked at one of the nars and 
there would definitely need to be entries.  Please try adding these in like 
other nars and I'm happy to help tweak it to get it to the finish line.

If you have questions on how to achieve any of the above, please ask.  Point 
us to a similar example nar you looked at, so that we can best help close the 
remaining gaps starting from good examples you've already seen.

Thanks


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-07 Thread vivekmuniyandi
Github user vivekmuniyandi commented on the issue:

https://github.com/apache/nifi/pull/2671
  
@MikeThomsen We don't have a public MarkLogic Docker image but we do have 
[this](https://hub.docker.com/r/patrickmcelwee/marklogic-dependencies/) on 
Docker Hub which would give you a head start on having a MarkLogic instance up 
and running. 

I will drop them an email and I will work on getting secure access to the 
cluster. Thanks for all the help. 


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-07 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2671
  
@vivekmuniyandi I'll try to find time to set up a MarkLogic node for 
testing some time in the next few days (day job and such is getting in the 
way). In the meantime, I would suggest reaching out to @joewitt directly to 
see if he or any of his folks have any time they can spare to jump in and help 
you out. Also, just a suggestion, you might want to think about setting up a 
secure cluster that you can privately allow reviewers to access so we/they can 
work with you to confirm everything works the way you expect.


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-04 Thread vivekmuniyandi
Github user vivekmuniyandi commented on the issue:

https://github.com/apache/nifi/pull/2671
  
Thanks @MikeThomsen for the comment. `PutMarkLogicRecord` is definitely on 
our roadmap but I am not sure when we will be able to get to it. We have an 
internal sprint for NiFi. We will add this to our backlog, check with PM and 
address this with priority. 

We don't want to keep this PR waiting for that processor. I would raise a 
separate PR in the future for that. Thanks!


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-04 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2671
  
BTW, I would strongly recommend your team discuss adding a 
`PutMarkLogicRecord` processor so you can do a bulk ingestion event from a 
single flowfile. We have quite a few good implementations such as ones for 
HBase, ElasticSearch and MongoDB that you can use/steal from to make it happen. 
Would **strongly** recommend you do that because it'll make bulk ingestion of 
very large data sets go much faster for MarkLogic. If you want to do that, feel 
free to just start work on it and push changes into this PR and we'll just keep 
going.


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-04 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2671
  
@vivekmuniyandi I'll try to build a MarkLogic Docker image and share it on 
Docker Hub so others can use that if they want.


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-03 Thread vivekmuniyandi
Github user vivekmuniyandi commented on the issue:

https://github.com/apache/nifi/pull/2671
  
Thanks @MikeThomsen ! Have made the changes. 


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-03 Thread vivekmuniyandi
Github user vivekmuniyandi commented on the issue:

https://github.com/apache/nifi/pull/2671
  
I followed this link - 
https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide#ContributorGuide-Keepingyourfeaturebranchcurrent

and it looks like this pulled others' commits as well.


---


[GitHub] nifi issue #2671: NiFi-5102 - Adding Processors for MarkLogic DB

2018-05-03 Thread MikeThomsen
Github user MikeThomsen commented on the issue:

https://github.com/apache/nifi/pull/2671
  
@vivekmuniyandi looks like you pulled a few other folks' commits in with 
your last push. Do this to clear that up:

1. git checkout master
2. git pull <remote> master
3. git checkout nifi-5102
4. git rebase master
5. git push marklogic --force nifi-5102

You probably did a pull on master into nifi-5102. You want to avoid that 
for your own sanity's sake and use a rebase instead.
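
The steps above can be sketched as one shell function. Step 2 does not name a specific remote, so the sketch takes it as a parameter instead of guessing; the function name `rebase_feature` and the remote name `upstream` in the usage note are assumptions, while `marklogic` and `nifi-5102` come from the steps themselves.

```shell
#!/bin/sh
# Sketch only: rebase a feature branch onto a freshly updated master and
# force-push it to a fork. Run from inside the working clone.
rebase_feature() {
  upstream_remote="$1"  # remote tracking the main project (name is an assumption)
  fork_remote="$2"      # remote the PR branch is pushed to, e.g. marklogic
  branch="$3"           # feature branch, e.g. nifi-5102

  # Each step runs only if the previous one succeeded, so a rebase
  # conflict stops the sequence before anything is pushed.
  git checkout master &&
    git pull "$upstream_remote" master &&
    git checkout "$branch" &&
    git rebase master &&
    git push "$fork_remote" --force "$branch"
}
```

A call might look like `rebase_feature upstream marklogic nifi-5102`. The force push is needed because the rebase rewrites the branch's commits; `git push --force-with-lease` is a more cautious alternative that refuses to overwrite remote commits you have not fetched.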


---