GitHub user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/2493
  
    Hey @david-streamlio this is very cool! I've been thinking about writing 
processors for interacting with Pulsar myself but haven't had a chance yet. 
Just a few things that we should think through a bit:
    
    - Re: Connection Pool in a Controller Service vs. doing it in the Processor: 
    what makes sense here depends, I think, on how you expect it to be used. If 
    you expect users to create several Pulsar processors with the same connection 
    info, then a Controller Service makes sense. If you think the more common 
    case will be a single instance of the Processor, then configuring it in the 
    Processor is probably easier for the user. Both have their merits, though, so 
    I'm fine with either approach personally.
    
    - One concern I have is that with the Kafka processors, we end up having to 
    create a new copy of the processors with pretty much each release of Kafka so 
    that we can take advantage of the new features. Have you considered how you 
    see this evolving as more versions of Pulsar are released? There are two 
    approaches we often see in NiFi. One is to create a new processor for each 
    new version, as we did with Kafka. The other is to have a "Client Service" 
    Controller Service with methods like publish(FlowFile), consume(), or 
    something like that. Then there is only a single ConsumePulsar processor and 
    a single PublishPulsar processor, each configured with the Controller Service 
    that handles interacting with Pulsar directly. Either approach is okay, I 
    think. But we should probably at least think about naming - does it make 
    sense to name these ConsumePulsar_1_20 or ConsumePulsar_1_0, or something of 
    that nature? I think it's best to figure this part out before the initial 
    release, because it can become confusing if we later end up with processors 
    like ConsumePulsar and ConsumePulsar_1_35, for instance.
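    To make the "Client Service" option concrete, here is a minimal sketch of 
    the shape it could take. Everything here is hypothetical: the 
    `PulsarClientService` interface, the `InMemoryPulsarClientService` stub, and 
    the method signatures are illustrative assumptions, not NiFi's or Pulsar's 
    actual API. The idea is that all version-specific client code lives behind 
    the service, so a single ConsumePulsar/PublishPulsar processor pair never 
    needs to change when Pulsar does.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical Controller Service contract: the processors only see this
// interface; each Pulsar client version gets its own implementation.
interface PulsarClientService {
    void publish(byte[] message);   // would be called by a single PublishPulsar processor
    byte[] consume();               // would be called by a single ConsumePulsar processor; null if nothing is available
}

// Stand-in implementation backed by an in-memory queue, so the interaction
// pattern can be shown without a running Pulsar broker.
class InMemoryPulsarClientService implements PulsarClientService {
    private final Queue<byte[]> topic = new ArrayDeque<>();

    @Override
    public void publish(byte[] message) {
        topic.add(message);
    }

    @Override
    public byte[] consume() {
        return topic.poll(); // null when the queue is empty
    }
}

public class ClientServiceSketch {
    public static void main(String[] args) {
        // The processors would be handed the service by the framework;
        // here we just wire it up directly.
        PulsarClientService service = new InMemoryPulsarClientService();
        service.publish("hello".getBytes());
        System.out.println(new String(service.consume())); // prints "hello"
    }
}
```

    With this split, upgrading to a new Pulsar release would mean adding a new 
    service implementation rather than a new ConsumePulsar_x_y processor, which 
    sidesteps the naming question for the processors themselves.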
    
    Thoughts?

