[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yogi Devendra updated APEXMALHAR-2369:
--------------------------------------
    Description: 
Currently, S3 output is available through S3OutputModule, which is restricted
to copying files from a FileSystem to S3. Use cases where tuples/records are
to be written to S3 cannot use this approach. Thus, we need to develop an
alternative module that takes care of writing tuples to S3.

Design: 
Sending a separate request to S3 for each tuple would be too expensive.
Instead, this module can write tuples to HDFS and then upload the HDFS files
to S3. This adds some end-to-end latency, but that should be OK for the S3
output case.
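
The staging idea in the design can be sketched roughly as follows. This is
only an illustration under assumptions, not the proposed module: the class
name StagedTupleWriter, the Uploader interface, the "part-N" file naming and
the size-based roll-over threshold are all hypothetical. A local directory
stands in for HDFS, and the Uploader callback stands in for the S3 upload.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: stage tuples in files, then upload each file in one request. */
class StagedTupleWriter implements AutoCloseable {

  /** Hypothetical stand-in for the S3 upload step; not a Malhar or AWS API. */
  interface Uploader {
    void upload(Path stagedFile) throws IOException;
  }

  private final Path stagingDir;      // stands in for an HDFS staging directory
  private final long maxBytesPerFile; // assumed roll-over threshold
  private final Uploader uploader;

  private BufferedWriter out;         // writer for the currently open staged file
  private Path current;
  private long bytesInCurrent;
  private int fileIndex;

  StagedTupleWriter(Path stagingDir, long maxBytesPerFile, Uploader uploader) {
    this.stagingDir = stagingDir;
    this.maxBytesPerFile = maxBytesPerFile;
    this.uploader = uploader;
  }

  /** Appends one tuple as a line; uploads the staged file once it is large enough. */
  void write(String tuple) throws IOException {
    if (out == null) {
      current = stagingDir.resolve("part-" + fileIndex++);
      out = Files.newBufferedWriter(current, StandardCharsets.UTF_8);
      bytesInCurrent = 0;
    }
    out.write(tuple);
    out.write('\n');
    bytesInCurrent += tuple.getBytes(StandardCharsets.UTF_8).length + 1;
    if (bytesInCurrent >= maxBytesPerFile) {
      roll();
    }
  }

  /** Closes the staged file, hands it to the uploader, then removes the local copy. */
  private void roll() throws IOException {
    if (out == null) {
      return; // nothing staged since the last upload
    }
    out.close();
    out = null;
    uploader.upload(current); // one request per file instead of one per tuple
    Files.delete(current);
  }

  /** Uploads whatever is still staged. */
  @Override
  public void close() throws IOException {
    roll();
  }
}
```

With this shape, the number of upload requests depends on the roll-over
threshold rather than on the tuple count. A larger threshold means fewer
requests but more end-to-end latency, which is the trade-off noted above.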

  was:Currently, S3 output is available using S3OutputModule which is 
restricted for copying files from FileSystem to S3. Use-cases where all the 
tuples/records to be written to S3 cannot use this approach. Thus, we need to 
develop alternative module which would take care of writing tuples on S3. 
Design: Sending separate requests to S3 for each tuple would be too expensive. 
This module can choose to write tuples to HDFS. And then upload HDFS files to 
S3. This would lead to some end-to-end latency. But, it should OK for the S3 
output case.


> S3 output module for tuple based output
> ---------------------------------------
>
>                 Key: APEXMALHAR-2369
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2369
>             Project: Apache Apex Malhar
>          Issue Type: Task
>            Reporter: Yogi Devendra
>            Assignee: Yogi Devendra
>
> Currently, S3 output is available through S3OutputModule, which is
> restricted to copying files from a FileSystem to S3. Use cases where
> tuples/records are to be written to S3 cannot use this approach. Thus, we
> need to develop an alternative module that takes care of writing tuples to S3.
> Design:
> Sending a separate request to S3 for each tuple would be too expensive.
> Instead, this module can write tuples to HDFS and then upload the HDFS files
> to S3. This adds some end-to-end latency, but that should be OK for the S3
> output case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
