[ https://issues.apache.org/jira/browse/CONNECTORS-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595490#comment-14595490 ]
Karl Wright commented on CONNECTORS-1162:
-----------------------------------------
Offhand, you do seem to have a callback to synchronize on:
{code}
send(ProducerRecord<K,V> record, Callback callback)
{code}
So the question is: what information comes back in the callback, and does
case (3) apply? That matters because ManifoldCF has a fixed number of worker
threads. If all of them are blocked waiting on a queue in Kafka while Kafka
is waiting for more documents, you have a deadlock. So we need to know that,
although you will likely find out if you just try it. ;-)
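To make the concern concrete, here is a minimal sketch of one way a worker thread could synchronize on the callback without blocking forever. It does not use the Kafka client: `AsyncSender` and `SendCallback` are hypothetical stand-ins for `send(record, callback)` and its callback, and the `Result` values and timeouts are made up for illustration. In Kafka's Java client the real callback is, as far as I know, `Callback.onCompletion(RecordMetadata metadata, Exception exception)`, which reports either record metadata or an exception.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class CallbackWaitSketch {

    /** Hypothetical stand-in for the producer callback: reports metadata or an error. */
    interface SendCallback {
        void onCompletion(String metadata, Exception error);
    }

    /** Hypothetical stand-in for an asynchronous send(record, callback). */
    interface AsyncSender {
        void send(String record, SendCallback callback);
    }

    enum Result { ACKED, FAILED, TIMED_OUT, INTERRUPTED }

    /** Sends one record and blocks the calling worker thread for at most timeoutMs. */
    static Result sendAndWait(AsyncSender sender, String record, long timeoutMs) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<Exception> failure = new AtomicReference<>();
        sender.send(record, (metadata, error) -> {
            failure.set(error);   // stays null when the send succeeded
            done.countDown();     // wakes up the waiting worker thread
        });
        try {
            // Bounded wait: if the callback never fires, the worker is released
            // after timeoutMs instead of being parked forever.
            if (!done.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                return Result.TIMED_OUT;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return Result.INTERRUPTED;
        }
        return failure.get() == null ? Result.ACKED : Result.FAILED;
    }

    public static void main(String[] args) {
        // Single background thread playing the role of the producer's I/O thread.
        ExecutorService io = Executors.newSingleThreadExecutor();
        try {
            AsyncSender acks = (record, cb) ->
                    io.execute(() -> cb.onCompletion("offset=0", null));
            AsyncSender fails = (record, cb) ->
                    io.execute(() -> cb.onCompletion(null, new RuntimeException("broker unavailable")));
            AsyncSender stalls = (record, cb) -> { /* never calls back */ };

            System.out.println(sendAndWait(acks, "doc-1", 1000));   // prints ACKED
            System.out.println(sendAndWait(fails, "doc-2", 1000));  // prints FAILED
            System.out.println(sendAndWait(stalls, "doc-3", 100));  // prints TIMED_OUT
        } finally {
            io.shutdown();
        }
    }
}
```

The point of the bounded wait is that a worker thread always comes back, so the fixed worker pool cannot be exhausted by sends that never complete. Whether a timed-out document should be retried or rejected is a separate decision that this sketch does not address.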
> Apache Kafka Output Connector
> -----------------------------
>
> Key: CONNECTORS-1162
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1162
> Project: ManifoldCF
> Issue Type: Wish
> Affects Versions: ManifoldCF 1.8.1, ManifoldCF 2.0.1
> Reporter: Rafa Haro
> Assignee: Karl Wright
> Labels: gsoc, gsoc2015
> Fix For: ManifoldCF 1.10, ManifoldCF 2.2
>
> Attachments: 1.JPG, 2.JPG
>
>
> Kafka is a distributed, partitioned, replicated commit log service. It
> provides the functionality of a messaging system, but with a unique design. A
> single Kafka broker can handle hundreds of megabytes of reads and writes per
> second from thousands of clients.
> Apache Kafka is being used for a number of use cases. One of them is to use
> Kafka as a feeding system for streaming BigData processes, in both Apache
> Spark and Hadoop environments. A Kafka output connector could be used for
> streaming or dispatching crawled documents or metadata and putting them into
> a BigData processing pipeline.