I have a single source of data. The processed records have to be directed
to multiple destinations, i.e.:

1. read the source data
2. based on a condition, route records to the following destinations:
    1. Kafka for error records
    2. success records matching one condition to S3 bucket "A", folder "a"
    3. success records matching another condition to S3 bucket "A", folder "b"
    4. success records matching another condition to a different S3 bucket

How can I achieve this in PySpark?
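Here is a rough sketch of what I have in mind, assuming batch DataFrames and
placeholder names throughout (the `status`/`type` columns, bucket names, broker
address, and topic are stand-ins for my real conditions and config):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("multi-sink-routing").getOrCreate()

# Read the single source (placeholder path and format)
source_df = spark.read.parquet("s3a://source-bucket/input/")

# Cache because the same data feeds several sinks
source_df.cache()

# Hypothetical routing conditions -- stand-ins for the real business rules
error_df  = source_df.filter(F.col("status") == "ERROR")
success_a = source_df.filter((F.col("status") == "OK") & (F.col("type") == "a"))
success_b = source_df.filter((F.col("status") == "OK") & (F.col("type") == "b"))
other_ok  = source_df.filter((F.col("status") == "OK") & (F.col("type") == "other"))

# Error records -> Kafka (the batch Kafka sink expects a "value" column)
(error_df
    .select(F.to_json(F.struct("*")).alias("value"))
    .write
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("topic", "error-records")
    .save())

# Success records -> different folders of bucket "A"
success_a.write.mode("append").parquet("s3a://A/a/")
success_b.write.mode("append").parquet("s3a://A/b/")

# Remaining success records -> a different bucket
other_ok.write.mode("append").parquet("s3a://other-bucket/records/")

source_df.unpersist()
```

Is this fan-out of separate filter/write actions the right approach, or is there
a better pattern for this?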

Are there any resources on design patterns or commonly followed industry
architectural patterns for Apache Spark, covering the following source/destination
combinations?

| Source   | Destination |
| -------- | ----------- |
| Single   | Single      |
| Multiple | Single      |
| Single   | Multiple    |
| Multiple | Multiple    |


