[ https://issues.apache.org/jira/browse/CONNECTORS-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698873#comment-14698873 ]

Karl Wright commented on CONNECTORS-1162:
-----------------------------------------

Hi Tugba,

The handling of InterruptedException worries me.  When does Kafka throw these?

When MCF shuts down, it interrupts all worker threads.  When these threads are 
interrupted, they should abort as quickly as possible by throwing the following:

{code}
new ManifoldCFException("interrupted", ManifoldCFException.INTERRUPTED)
{code}

If Kafka only throws InterruptedException when the calling thread is being 
terminated, then you will need to catch it and throw the ManifoldCFException 
above instead. Otherwise MCF may not shut down properly while sending data to 
Kafka.
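
A minimal sketch of that conversion, assuming the InterruptedException surfaces from a blocking wait such as Future.get() on a pending send. The class InterruptSketch and the helper awaitSend are hypothetical names, and the nested ManifoldCFException is only a stand-in for the real MCF class so that the example compiles on its own; its numeric codes are placeholders.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class InterruptSketch {

  // Stand-in for org.apache.manifoldcf.core.interfaces.ManifoldCFException so the
  // sketch compiles without MCF on the classpath; the numeric codes are placeholders.
  static class ManifoldCFException extends Exception {
    static final int GENERAL_ERROR = 0;
    static final int INTERRUPTED = 2;
    private final int errorCode;

    ManifoldCFException(String message, int errorCode) {
      super(message);
      this.errorCode = errorCode;
    }

    int getErrorCode() {
      return errorCode;
    }
  }

  // Hypothetical helper: waits for a pending send and maps an interrupt to the
  // exception MCF expects from a worker thread that is being shut down.
  static <T> T awaitSend(Future<T> pendingSend) throws ManifoldCFException {
    try {
      return pendingSend.get();
    } catch (InterruptedException e) {
      // The thread was interrupted while waiting: abort as quickly as possible.
      throw new ManifoldCFException("interrupted", ManifoldCFException.INTERRUPTED);
    } catch (ExecutionException e) {
      // The send itself failed; report it as an ordinary error.
      throw new ManifoldCFException("send failed: " + e.getCause(),
          ManifoldCFException.GENERAL_ERROR);
    }
  }

  public static void main(String[] args) {
    // Simulate MCF interrupting the worker thread before it waits on the send.
    Thread.currentThread().interrupt();
    try {
      // This future never completes, so get() would otherwise block.
      awaitSend(new CompletableFuture<String>());
    } catch (ManifoldCFException e) {
      System.out.println("error code: " + e.getErrorCode());
    }
  }
}
```

In a real connector the stand-in class would be dropped in favour of the MCF one. Whether Kafka throws InterruptedException only in this situation is the open question above, so the catch clause may need to differ.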




> Apache Kafka Output Connector
> -----------------------------
>
>                 Key: CONNECTORS-1162
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1162
>             Project: ManifoldCF
>          Issue Type: Wish
>    Affects Versions: ManifoldCF 1.8.1, ManifoldCF 2.0.1
>            Reporter: Rafa Haro
>            Assignee: Karl Wright
>              Labels: gsoc, gsoc2015
>             Fix For: ManifoldCF 2.3
>
>         Attachments: 1.JPG, 2.JPG, Documentation.zip
>
>
> Kafka is a distributed, partitioned, replicated commit log service. It 
> provides the functionality of a messaging system, but with a unique design. A 
> single Kafka broker can handle hundreds of megabytes of reads and writes per 
> second from thousands of clients.
> Apache Kafka is being used for a number of use cases. One of them is to use 
> Kafka as a feeding system for streaming BigData processes, in both Apache 
> Spark and Hadoop environments. A Kafka output connector could be used for 
> streaming or dispatching crawled documents or metadata and putting them into 
> a BigData processing pipeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
