I have decided to stream all the data into S3 and use scheduled EMR Spark 
jobs to partition the data by account, which simplifies the process.

The tricky part is capping the file sizes so that the Spark job can still 
reduce them on a regular basis. I think that using `take` on the source 
together with a completion timeout should allow me to achieve this.
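
To make the idea concrete, here is a rough sketch of "close a file after N 
messages or after T seconds, whichever comes first". It is plain Python, not 
Akka Streams code, and the names `capped_batches`, `max_items`, `max_seconds` 
and `clock` are made up for this illustration. It also only checks the 
timeout when the next message arrives, whereas a real stream stage would 
presumably use a timer.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def capped_batches(
    source: Iterable[T],
    max_items: int,
    max_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[List[T]]:
    """Group `source` into batches of at most `max_items` elements.

    A batch is also closed early if `max_seconds` have passed since its
    first element. Hypothetical helper: the timeout is only checked when
    the next element arrives, not by a background timer.
    """
    batch: List[T] = []
    deadline = 0.0
    for item in source:
        now = clock()
        # Time cap: the open batch is too old, so emit it before adding more.
        if batch and now >= deadline:
            yield batch
            batch = []
        # The first element of a new batch starts the timeout window.
        if not batch:
            deadline = now + max_seconds
        batch.append(item)
        # Size cap: emit as soon as the batch is full.
        if len(batch) >= max_items:
            yield batch
            batch = []
    # Flush whatever is left when the source ends.
    if batch:
        yield batch
```

Each yielded batch would correspond to one capped file written to S3, so 
the Spark job never has to compact an unbounded object.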


On Tuesday, September 26, 2017 at 2:47:47 AM UTC+1, David Cromberge wrote:
>
> I have a Kafka topic where messages have an account field, as well as a 
> payload of decimal values. I would like to save the decimal values to an S3 
> file for each account. I would like to preserve at-least-once delivery 
> semantics and only commit the offset once I'm sure the contents of the file 
> were accepted by the S3 file sink. 
>
> I would like to know if Akka Streams is suitable for such a use case. 
> Firstly, if I were to use the Alpakka S3 client connector, I would need a 
> way to dynamically create this sink when encountering a message with an 
> account that has not been seen before. 
> Secondly, when using a Kafka connector like Reactive Kafka, I would need to 
> pass the commit offset through the stream and somehow commit it only after 
> the message has passed through the sink. 
>
> I have tried to create a custom graph sink with a concurrent map of sinks, 
> but this did not work out very well. I have also looked at `groupBy` but 
> have not figured out how to feed each subflow to an S3 bucket sink based on 
> its discriminator account. 
>
> I would appreciate any advice on how to progress. I am a relative newcomer 
> to Akka Streams and am tempted to fall back to an actor-based solution, 
> albeit without backpressure etc.
