Googling doesn't turn up a similar thread except
https://forums.databricks.com/questions/2095/sparkstreaming-to-process-http-rest-end-point-serv.html,
and it seems people are doing this through a custom receiver as well
(https://old.reddit.com/r/apachespark/comments/6ihdor/best_http_client_for_spark_to_read_from_rest_api/).
Another option looks like Apache Livy, but that's too heavy because we don't
want to add an additional layer.

My requirement is basically similar to what is described in the first thread:
reading input data such as JSON, XML, zip files, or something similar from a
RESTful endpoint, then doing validation, transformation, etc., and pushing the
results to Kafka.

As of today, is a custom receiver still the recommended way to achieve this?
If not, are there any libraries that can accomplish such a task?
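
For reference, this is roughly what I have in mind with the custom-receiver
approach; the endpoint URL, poll interval, and Kafka topic name below are
made up for illustration, and the validation step is just a placeholder:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.receiver.Receiver
import scala.io.Source

// Receiver that periodically polls a REST endpoint and stores each response body.
class RestReceiver(url: String, pollIntervalMs: Long)
    extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  override def onStart(): Unit = {
    new Thread("REST Receiver") {
      override def run(): Unit = poll()
    }.start()
  }

  override def onStop(): Unit = {}  // polling loop exits when isStopped() is true

  private def poll(): Unit = {
    while (!isStopped()) {
      try {
        val src = Source.fromURL(url)          // simple GET; a real HTTP client could be swapped in
        try store(src.mkString) finally src.close()
      } catch {
        case e: Exception => reportError("Failed to fetch " + url, e)
      }
      Thread.sleep(pollIntervalMs)
    }
  }
}

object RestToKafka {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("RestToKafka")
    val ssc = new StreamingContext(conf, Seconds(10))

    // hypothetical endpoint, polled every 5 seconds
    val raw = ssc.receiverStream(new RestReceiver("https://example.com/api/data", 5000))

    // validation / transformation would go here
    val validated = raw.filter(_.nonEmpty)

    validated.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        // one producer per partition so nothing non-serializable crosses the driver/executor boundary
        val props = new Properties()
        props.put("bootstrap.servers", "localhost:9092")
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        val producer = new KafkaProducer[String, String](props)
        records.foreach(r => producer.send(new ProducerRecord[String, String]("events", r)))
        producer.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}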

Thanks
